Method for the expression of unknown environmental DNA into adapted host cells



Introduction and Background

The present invention relates to methods and compositions for nucleic acid production,
analysis and cloning. The present invention discloses tools and methods for the
production and analysis of libraries of polynucleotides, particularly metagenomic
libraries, which can be used to identify novel pathways, novel enzymes and novel
metabolites of interest in various areas, including the pharmaceutical, cosmetic,
agrochemical and/or food industries.

The drug discovery process is based on two main fields, namely combinatorial chemistry and
natural products. Combinatorial chemistry has shown its ability to generate huge
numbers of molecules, but with limited chemical diversity. By contrast, natural
products have been the predominant source of structural and molecular diversity.
However, the exploitation of this diversity is strongly hampered by their limited accessibility,
by complex identification and purification processes, and by difficulties in their production.

Microorganisms are known to synthesize a large diversity of natural compounds which
are already widely used in the therapeutic, agricultural, food and industrial areas. However,
this promising approach to the identification of new natural compounds has always been
considerably limited by the principal technological bottlenecks of isolating and propagating
in vitro the huge diversity of bacteria. Most microorganisms living in a natural,
complex environment (soil, digestive tract, sea, etc.) have not been cultivated because
their optimal living conditions are either unknown or difficult to reproduce. Numerous
scientific publications report this fact and it is now assumed that less than about 1% of
the total bacterial diversity (when all environments are considered together) has been
isolated and cultivated (Amann et al., 1995).

New approaches have been developed to try to bypass the critical step of isolation, and
to access directly the huge genetic potential established by microbial adaptation
processes over their long evolution. These approaches are called "metagenomic"
because they address the plurality of genomes of a whole bacterial community, without any
distinction (the metagenome).

Metagenomics involves the direct extraction of DNA from environmental samples and its
propagation and expression in a cultivated host cell, typically a bacterium. Metagenomics
was first developed for the identification of new bacterial phyla (Pace, 1997).
This use is based on the specific cloning of genes recognized for their interest as
phylogenetic markers, such as 16S rDNA genes. Further developments of metagenomics
relate to the detection and cloning of genes coding for proteins of environmental or
industrial interest. These first two applications of metagenomics involve a first step of
gene selection (generally using PCR) before cloning. In the case of protein production,
the cloning vectors used are preferably also expression vectors, i.e., they contain
regulatory sequences upstream of the cloning site causing expression of the cloned gene
in a given bacterial host strain.

More recent developments of metagenomics consider the total metagenome, cloned
without any selection and/or identification, to establish random "metagenomic DNA
libraries". This provides access to the whole genetic potential of bacterial diversity
without any a priori selection. Metagenomic DNA libraries are composed of hundreds
of thousands of clones which differ from each other by the environmental DNA
fragments which have been cloned. In this respect, large DNA fragments have been
cloned (more than 30 kb), so as (i) to limit the number of clones which have to be
analysed and (ii) to be able to recover whole biosynthetic pathways for the identification
of new metabolites resulting from multi-enzymatic synthesis. This last point is of
particular interest for bacterial metagenomic libraries since most biosynthetic
pathways have been found to be naturally organised in a single cluster of DNA, and even
in the same operon, in bacteria. Nevertheless, the heterologous expression of a whole
biosynthetic pathway (large DNA fragment) requires a much more elaborate system than
a simple expression vector to achieve full and stable expression.
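
The effect of insert size on the number of clones to be analysed can be illustrated with the standard Clarke and Carbon estimate, N = ln(1 - P) / ln(1 - f), where N is the number of clones needed to find a given sequence with probability P and f is the insert size divided by the total size of the DNA being sampled. The sketch below is illustrative only and is not taken from the present description: the function name, the number of genomes, the genome size and the insert sizes are hypothetical, and all genomes are assumed to be equally abundant.

```python
import math


def clones_needed(insert_kb, metagenome_kb, probability=0.99):
    """Clarke-Carbon estimate: N = ln(1 - P) / ln(1 - f), with f = insert / metagenome."""
    if not 0 < insert_kb < metagenome_kb:
        raise ValueError("insert size must be positive and smaller than the metagenome")
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    f = insert_kb / metagenome_kb
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - f))


# Hypothetical sample: 1000 equally abundant genomes of 4000 kb each.
METAGENOME_KB = 1000 * 4000

if __name__ == "__main__":
    for insert_kb in (3, 30, 80):
        n = clones_needed(insert_kb, METAGENOME_KB)
        print(f"{insert_kb:>3} kb inserts: {n:,} clones")
```

Under these assumptions, a tenfold increase in insert size reduces the required number of clones roughly tenfold, which is the rationale given above for cloning fragments of more than 30 kb.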

Except for the identification and characterisation of bacterial communities at the
phylogenetic or diversity levels, metagenomic libraries produced in the prior art are gene
expression libraries, i.e., the environmental DNA fragments are cloned downstream of a
functional promoter, to allow their expression and analysis. In this regard, WO99/45154
and WO96/34112 relate to combinatorial gene expression libraries which comprise a
pool of expression constructs where each expression construct contains DNA which is
operably associated with one or more regulatory regions that drive expression of genes in
an appropriate organism. Furthermore, the expression constructs used in these methods
have a very limited and invariable host range. Similarly, WO 01/40497 relates to the
construction and use of expression vectors which can be transferred into one chosen
bacterial expression host of the Streptomyces genus. All these approaches are, however,
very limited since they require the presence of expression signals and confer invariable
or very limited host range capabilities. Furthermore, most (if not all) metagenomic DNA
libraries have been established in E. coli, which is the most efficient cloning system.
However, most environmental DNAs are not expressed or functionally active in E. coli. In
particular, functional analysis in E. coli of genes cloned from G+C rich organisms, such
as Actinomyces, may be limited by the lack of an adequate transcription and translation
system. Also, the post-translational modification system of E. coli is not operative on
heterologous proteins from Actinomycetes, and some specific substrates required for protein
activity are not present in E. coli.

The stable maintenance of large foreign DNA fragments (> 10 kb) in a selected host
cell is one of the key requirements for academic research and applied industrial purposes.
Usually, the vector carrying the foreign DNA is maintained by cultivating the host cells
in a medium with a vector-specific selective pressure (resistance to an antibiotic, for
example). However, when large foreign DNA fragments are cloned and/or expressed,
their propagation and/or expression requires energy, which is no longer available for cell
growth. As a consequence of this new resource allocation (nutrients/energy), it is not
unusual for the recombinant cell to react by rearranging the foreign DNA (deletion,
modification, etc.). This results in the modification of the foreign
genetic information and in the loss of DNA functionality. This can be observed without
any loss of the resistance to the selective pressure conferred by the vector. As a result, the
recombinant clone can no longer be exploited for genetic or functional analysis.

Thus, the huge potential of metagenomics for the discovery of new
natural compounds, pathways or genes cannot be fully exploited with currently existing
methods. Alternative technologies and processes must be developed to allow stable
maintenance and propagation of large foreign DNAs in host cells, for the production of
efficient libraries and functional screening in a large variety of host cell species,
including Bacillus or Streptomyces, so as to take full account of the huge diversity of
environmental DNAs.

Summary of the Invention 

The present invention discloses improved tools and methods for the production and
analysis of libraries of polynucleotides, particularly metagenomic libraries, which can be
used to identify and produce novel pathways, novel enzymes and novel metabolites of
interest.

More particularly, the invention now proposes to keep the advantage of highly efficient
cloning in E. coli and to modify the properties of metagenomic libraries, to allow genetic
and functional analyses of particular selected clones in any appropriate system, thereby
making possible the stable maintenance and propagation, the analysis and/or the
expression of the huge diversity of metagenomic libraries. According to the invention,
polynucleotide libraries can be produced in any convenient cloning system, such as E.
coli, and then modified, depending on the desired selection or screening system, to adapt
the host range and/or properties of the library (or a portion thereof).

A particular object of this invention resides more specifically in a method of analysing a
library of polynucleotides, said polynucleotides being contained in cloning vectors
having a particular host range, the method comprising (i) selecting cloning vectors in the
library which contain a polynucleotide having a particular characteristic, (ii) modifying
said selected cloning vectors to allow a transfer of said vectors into a selected host cell
and integration of the polynucleotide contained in said vectors into the genome of the
selected host cell, and (iii) analysing the polynucleotides contained in said modified
vectors upon transfer of said modified vectors into said selected host cell.

Another object of this invention is a library of polynucleotides, wherein said library
comprises a plurality of environmental DNA fragments cloned into cloning vectors,
wherein said environmental DNA fragments contain a common molecular characteristic
and wherein said cloning vectors are E. coli cloning vectors comprising a target
polynucleotide construct allowing (i) transfer of the environmental DNA into a selected
host cell distinct from E. coli, (ii) integration of the environmental DNA into the genome
of a selected host cell, and (iii) stable maintenance and propagation of the environmental
DNA in the selected host cell.

A further object of this invention is a method of producing modified libraries of
polynucleotides, the method comprising selecting a sub-population of clones in a first
library, based on the presence or absence of a characteristic of interest, and modifying
the properties of said selected clones to allow their functional analysis or expression.

The modification in the library or cloning vector is typically obtained by targeted
insertion of a polynucleotide construct, preferably using transposable elements, either in
vitro or in vivo.

The integration into the genome of the selected host cell is typically obtained by
site-specific integration or by homologous or heterologous DNA/DNA recombination.



The invention is particularly suited for producing and analysing genetic diversity 
(metagenomic libraries), to identify new genes and isolate new metabolites, drugs, 
enzymes, antibiotics, etc. 

Legend to the Figures

Figure 1: Map of the pP1 vector carrying the transposable construct Tn<Apra> (fig. 1a). The
excised transposable construct is 1132 bp in size. It contains two mosaic ends (ME) and a gene
conferring resistance to apramycin (Apra) (fig. 1b).

Figure 2: pPL1 vector (fig. 2a) carrying the conjugative transposable construct
Tn<Apra-oriT> (fig. 2b). The nucleotide sequence of the transposable construct contains
an origin of transfer (oriT) and a gene conferring resistance to apramycin (Apra).
The direction of DNA transfer at oriT is shown by an arrow.

Figure 3: Transposable construct Tn<Apra-oriT-att-int> (fig. 3a) on vector pPAOI6 (fig.
3b). The transposable construct contains the ΦC31 integrase gene and attachment DNA
sequence for site-specific integration, an origin of transfer and a gene for selection. The
orientation of genes and the direction of DNA transfer are marked by arrows.

Figure 4: Target DNA suitable for insertion of the transposable
construct (fig. 4a). Insertion of the conjugative and site-specific integrative transposable
construct Tn<Apra-oriT-int> is shown in fig. 4b and 4c. Insertion of the transposable
construct into the selective gene marker carried on the original vector is shown in fig. 4b. In
this event, the cloned insert is intact and can be transferred to a heterologous host. Insertion
of the transposable construct into the cloned DNA insert results in gene inactivation (fig. 4c).

Figure 5: Complete annotated DNA sequences of fosmid clones FS3-124 (a, SEQ ID
NO: 1) and FS3-135 (b, SEQ ID NO: 2).

Figure 6: Morphological differences between Streptomyces transconjugants. (Assay)
Conjugations were performed with FS3-124 modified with the transposable construct of
pPAOI6; (control) conjugations were performed with pPAOI6.

Figure 7: Schematic map of the pPSB vector (top) and transposable element (bottom). The
transposable element has 720 bp of DNA from the amyE gene of B. subtilis in addition to
att-int-oriT-Apra.

Figure 8: Plasmid pPSBery (top) and transposable element (bottom) carrying the selective
marker ery AM, for resistance to erythromycin, and part of the amyE gene for homologous
recombination in B. subtilis.

Figure 9: The ΦC31 integrase was deleted from the pPSBery plasmid. The resulting plasmid is
pPSBery-DI (top). The transposable element contains the Apra and ery AM genes for selection,
the oriT origin of transfer and a part of the amyE gene for integration into the amyE locus of
the B. subtilis chromosome (bottom).

Figure 10: Map of the pTn5-7 AOI plasmid. The transposable element has the ends of the Tn5
(ME) and Tn7 (T7 R, T7 L) transposons.

Detailed Description of the Invention 

The invention provides novel strategies, methods and products for generating and
analysing combinatorial gene libraries. As indicated above, the invention discloses,
particularly, methods of analysing libraries of polynucleotides, said polynucleotides
being contained in cloning vectors having a particular host range, the methods
comprising (i) selecting cloning vectors in the library which contain a polynucleotide
having a particular characteristic, (ii) modifying said selected cloning vectors to allow a
transfer of said vectors and/or expression of the polynucleotide which they contain into a
selected host cell, and (iii) analysing the polynucleotides contained in said modified
vectors upon transfer of said modified vectors into said selected host cell, such as by
genetic, biochemical, chemical or phenotypical approaches.

In a most preferred embodiment, the methods allow stable transfer and propagation of
large environmental nucleic acids in a selected host following initial selection. Such
methods comprise (i) selecting cloning vectors in a library which contain a
polynucleotide having a particular characteristic, (ii) modifying said selected cloning
vectors to allow a transfer of said vectors and integration of the polynucleotide which
they contain into a selected host cell genome, and (iii) analysing the polynucleotides
contained in said modified vectors upon transfer of said modified vectors into said
selected host cell, such as by genetic, biochemical, chemical or phenotypical approaches.

Library of polynucleotides

The term "library of polynucleotides" designates a complex composition comprising a
plurality of polynucleotides, of various origins and structures. Typically, the library
comprises a plurality of unknown polynucleotides, i.e., of polynucleotides whose
sequence and/or source and/or activity is not known or characterized. In addition to such
unknown (or uncharacterized) polynucleotides, the library may further include known
sequences or polynucleotides. Typically, the library comprises more than 20 distinct
polynucleotides, more preferably at least 50, typically at least 100, 500 or 1000. The
complexity of the libraries may vary. In particular, libraries may contain more than 5000,
10 000 or 100 000 polynucleotides, of various origin, source, size, etc. Furthermore, the
polynucleotides are generally cloned into cloning vectors, allowing their maintenance
and propagation in suitable host cells, typically in E. coli. The polynucleotides in the
library may be in the form of a mixture or separated from each other, in all or in part. It
should be understood that some or each polynucleotide in the library may be present in
various copy numbers.

The polynucleotides in the libraries are more preferably obtained or cloned from
complex sources of nucleic acids, most preferably from environmental samples. Such
libraries are also termed "metagenomic libraries" since they contain nucleic acids
derived from whole genomes of mixed populations of microorganisms.

The term environmental sample designates, broadly, any sample containing (a plurality
of) uncharacterized microorganisms, particularly uncultivated (or non-cultivable)
microorganisms. The sample may be obtained or derived from specific organisms,
natural environments or from artificial or specifically created environments (e.g.,
industrial effluents, etc.). An uncultivated (or non-cultivable) microorganism is a
microorganism that has not been purposely cultured and expanded in isolated form. The
sample may be obtained or derived from soil, water, mud, plant extract, wood,
biological material, marine or estuarine sediment, industrial effluents, gas, mineral
extracts, sand, natural excrements, meteorites, etc. The sample may be collected from
various regions or conditions, such as tropical regions, deserts, volcanic regions, forests,
farms, industrial areas, households, etc.



Environmental samples usually contain various species of (uncharacterized,
uncultivated) microorganisms, such as terrestrial microorganisms, marine
microorganisms, salt water microorganisms, freshwater microorganisms, etc. Species of
such environmental microorganisms include autotrophic or heterotrophic organisms,
eubacteria, archaebacteria, algae, protozoa, fungi, viruses, phages, parasites, etc. The
microorganisms may include extremophile organisms, such as thermophiles,
psychrophiles, psychrotrophs, acidophiles, halophiles, etc. More specific examples of
environmental bacteria include actinomycetes, eubacteria and mycobacteria;
examples of fungi include phycomycetes, ascomycetes and basidiomycetes, etc. Other
organisms include, for instance, yeasts (Saccharomyces, Kluyveromyces, etc.), plant cells
(algae, lichens, etc.), corals, etc. The sample may comprise various species of such
donor (uncultivated) microorganisms, as well as various amounts thereof. The
environmental sample may contain, in addition, known and/or cultivable
microorganisms (e.g., prokaryotic or eukaryotic), as well as nucleic acids and organic
materials. The sample may also contain different animal cells: mammalian cells, insect
cells, etc. (arising from larvae, feces, etc.).

It should be understood that the present invention is not limited to any specific type of
sample or environmental microorganism, but can be used to produce diversity, create
nucleic acid libraries, etc., from any environmental sample comprising uncultivated
microorganisms. The sample may be wet, soluble, dry, in the form of a suspension,
paste, powder, solid, etc. Preferably the sample is dry or in a solid or semi-solid state
(e.g., paste, powder, mud, gel, etc.). The sample may be treated prior to nucleic acid
extraction, for instance by washing, filtrating, centrifuging, diluting, drying, etc.

The term [...] sample according to various techniques, [...] WO01/40497, in
Handelsman et al. (Chemistry & Biology 5(1[...]), [...] et al. ([...] 1999, 403;
Applied and Environm. Microbiol. [...]), [...] et al. (Applied and Environm. Microbiol. 65,
1999, 47,5) or Frostegard et al. (Applied and Environm. Microbiol., 65, 1999, 5409).

In a particular embodiment of the above method, the library comprises a plurality of
environmental DNA fragments. The library may also comprise other types of nucleic
acids, such as environmental RNAs, for instance.

[...] size, to produce homogenous libraries.



Cloning Vectors






As indicated, the polynucleotides are contained or cloned into cloning vectors. These
cloning vectors may be of various types, including plasmids, cosmids, fosmids, episomes,
[...] chromosomes, phages, viral vectors, etc. In most preferred [...], the
cloning vectors are selected from plasmids, cosmids, phages [...]
BACs, even more preferably from cosmids, P1 derivatives and BACs. By using cosmids
[...] derivatives, it is possible to generate homogenous libraries, since [...]
accommodate polynucleotides having a size of approximately 40 kb and 80 [...]
after the initial cloning step, the cloning capacity of the vectors is maximized and inserts
of as much as 40 and 80 kb in length can be cloned into fosmids, BACs and P1
derivatives.

As indicated, the cloning vector has a particular host range, i.e., the ability to replicate in
a particular type of host cell. Typically, the host is a bacterium, more preferably an E. coli
strain. Indeed, E. coli is so far the most convenient host cell for performing recombinant
technologies. The advantage of the present invention is that the starting library can be
produced in any suitable host system of choice, since the properties of the libraries will
be adapted later during the process.

Cloning vectors generally comprise the polynucleotide insert and genetic elements
necessary and sufficient for maintenance in a competent host cell. They typically
contain, in addition to the polynucleotide insert, an origin of replication functional in a
selected host cell as well as a marker gene for selection and screening. The cloning
vector may comprise additional elements, such as promoter regions, for instance.
Although cloning vectors may replicate in several different host cells, they are usually
adapted to a particular host cell type and not suitable or efficient for replication or
maintenance in other cell types.

In a preferred embodiment, the cloning vectors of the library are E. coli cloning vectors,
preferably cosmids, BACs or P1-derived vectors. E. coli cloning vectors may carry an
origin of replication derived from naturally-occurring plasmids, such as ColE1, pACYC
and p15A, for instance. Many E. coli cloning vectors are commercially available and/or
can be constructed using available regulatory sequences.

Screening of the cloning vectors

In step i) of the method, a first selection or screen is performed on the polynucleotide
library. The screen is performed so as to identify or select clones having (or lacking) a
particular, common characteristic. The selection may be carried out according to various
techniques, such as molecular screening, protein expression, functional screening, etc. A
preferred selection is performed by molecular screening. Molecular screening designates
any method of identification of molecular or structural characteristics in a polynucleotide
sequence. This can be done by a variety of techniques which are known per se, such as
hybridisation, amplification, sequencing, etc. Preferably, molecular screening comprises
the selection of clones in a library which contain, in their sequence, a particular sequence
or region or motif, said sequence or region or motif being characteristic of a particular
type of activity or gene (enzyme, biosynthetic pathways, etc.).

In a first variant, the selection is made by contacting the cloning vectors in the library
with a particular nucleic acid probe (or set of probes) containing a sequence which is
characteristic of a selected activity or function (a consensus sequence, a particular motif,
etc.). The cloning vectors in the library which hybridise to the probe (or set of probes)
are then selected.

In a second variant, the selection is made by contacting the cloning vectors in the library
with a particular pair of nucleic acid primers specific for a sequence which is
characteristic of a selected activity or function (a consensus sequence, a particular motif,
etc.), and a PCR amplification reaction is performed. The cloning vectors in the library
which lead to a positive amplification product are then selected.
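
When clone sequences are available, the PCR-based selection can be mimicked in silico. The following is a minimal sketch and not the method of the invention: it translates degenerate primers into regular expressions and reports the length of the amplification product that would be expected from a clone sequence. The function names, the toy primers and the toy clone sequences are hypothetical, and exact matching (no mismatches) on an unambiguous A/C/G/T template is assumed.

```python
import re

# Degenerate bases as regular-expression character classes.
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]",
    "K": "[GT]", "D": "[AGT]", "B": "[CGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Compile a degenerate primer into a regular expression."""
    return re.compile("".join(IUPAC_REGEX[base] for base in primer))


def amplicon_length(clone, sense, antisense):
    """Return the expected amplicon length, or None if no product is expected.

    The sense primer is searched on the given strand (first match) and the
    antisense primer on the reverse complement (first match there, i.e. the
    rightmost site on the given strand).
    """
    fwd = primer_regex(sense).search(clone)
    rev = primer_regex(antisense).search(reverse_complement(clone))
    if fwd is None or rev is None:
        return None
    # Map the antisense match back to coordinates on the given strand.
    rev_start_on_top = len(clone) - rev.end()
    rev_end_on_top = len(clone) - rev.start()
    if fwd.end() > rev_start_on_top:
        return None  # primers overlap or face away from each other
    return rev_end_on_top - fwd.start()


if __name__ == "__main__":
    positive = "AAGGCCCTTTTTTTTTTGATCCAA"  # hypothetical clone with both sites
    negative = "AAGGCCCTTTTTTTTTT"         # hypothetical clone lacking the antisense site
    for name, clone in (("positive", positive), ("negative", negative)):
        print(name, amplicon_length(clone, "GGSCC", "GGRTC"))
```

Clones for which `amplicon_length` returns a value would correspond to the vectors "which lead to a positive amplification product"; a practical screen would also have to tolerate mismatches and check the product size against the expected range.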

In this regard, the present application provides new primers designed in conserved
motifs of the β-ketoacyl synthase gene, which are particularly useful for screening
polynucleotides containing putative polyketide synthase (PKS) genes or domains. These
primers have the following degenerate sequences:

Sense primer : 5'- GGSCCSKCSSTSDCSRTSGAYACSGC -3' (SEQ ID NO: 3) 
Antisense primer : 5'- GCBBSSRYYTCDATSGGRTCSCC -3' (SEQ ID NO: 4) 

wherein:
R is A or G,
S is G or C,
Y is C or T,
K is G or T,
D is A or G or T, and
B is C or G or T.
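
Each degenerate sequence above stands for a mixture of individual primers. The sketch below, given for illustration only, counts and enumerates the individual sequences encoded by SEQ ID NO: 3 and SEQ ID NO: 4 using the base definitions listed above. The function names are hypothetical and the primer strings are copied from the text as printed.

```python
from itertools import islice, product
from math import prod

# Base definitions as listed above (standard IUPAC ambiguity codes).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "S": "CG", "Y": "CT",
    "K": "GT", "D": "AGT", "B": "CGT",
}

SENSE = "GGSCCSKCSSTSDCSRTSGAYACSGC"   # SEQ ID NO: 3
ANTISENSE = "GCBBSSRYYTCDATSGGRTCSCC"  # SEQ ID NO: 4


def degeneracy(primer):
    """Number of distinct primers encoded by a degenerate sequence."""
    return prod(len(IUPAC[base]) for base in primer)


def expand(primer):
    """Lazily yield every individual primer encoded by a degenerate sequence."""
    for combo in product(*(IUPAC[base] for base in primer)):
        yield "".join(combo)


def matches(primer, sequence):
    """True if an unambiguous sequence is one member of the degenerate mixture."""
    return len(primer) == len(sequence) and all(
        base in IUPAC[code] for code, base in zip(primer, sequence)
    )


if __name__ == "__main__":
    for name, primer in (("sense", SENSE), ("antisense", ANTISENSE)):
        print(name, len(primer), "nt,", degeneracy(primer), "variants")
        print("  first variants:", list(islice(expand(primer), 3)))
```

For the sequences as printed, the sense primer corresponds to a mixture of 6144 individual sequences and the antisense primer to 6912, which is why `expand` is written as a generator rather than building the full list.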

A particular object of this invention is a polynucleotide primer having one of the above
sequences, typically a mixture of different polynucleotide primers having a sequence
corresponding to one of the above degenerate sequences. A particular object of this
invention also resides in a pair of primers each having one of the above sequences.

Once particular clones have been selected, the analysis of their polynucleotides needs to
be confirmed and/or validated, and/or their polynucleotides can be used to study their
function and/or produce novel compounds or metabolites.

The invention now enables such further analysis and uses, by allowing a modification of
the cloning vectors that is specific and adaptable by the skilled person, depending on the
activity which is sought. In particular, it is possible to confer properties such as specific
expression or a novel, specific host range to the selected cloning vectors, to assess their
activity, as disclosed below.



Modification of the cloning vectors

After high-efficiency cloning using the most convenient cloning vectors, such as BACs or
cosmids propagated in E. coli, and after the identification, selection and/or
characterisation of cloned DNA fragments, the invention now makes it possible to modify
the cloning vectors specifically so as to transfer, integrate into the genome, maintain,
express and/or over-express the selected polynucleotides in any selected host expression
system which is suitable to assess the selected activity or property. Such selected hosts
may be native or heterologous host cells, and include, but are not limited to,
Streptomyces, Nocardia, Bacillus, fungi, yeasts, etc.

The selected cloning vectors of the library may be modified according to various
techniques. The modification is typically a genetic modification, comprising the
introduction of particular genetic sequences into the structure of the cloning vector, in
addition to or in replacement of sequences contained in said vector. It is highly preferred
to use specific or targeted (or oriented) techniques to improve the efficacy of the method.
By "specific" is meant that the modification occurs at a pre-determined location in the
cloning vector, through site-specific mechanisms. By "targeted" is meant that the
modification occurs in a controlled way, so as not to alter the polynucleotide insert
contained in the vector in an undesirable way.

In a preferred embodiment, the selected vectors are modified by insertion, into the
vector, of a target polynucleotide construct which contains genetic elements conferring
the selected property(ies) to the cloning vector.

The target polynucleotide construct typically comprises the genetic elements necessary to
transfer, propagate, integrate into the genome, maintain, express or overexpress the
cloned polynucleotide in a chosen (bacterial) host expression system. Said genetic
elements may include particular origin(s) of replication, particular origin(s) of transfer,
particular integrase(s), transcriptional promoter(s) or silencer(s), either alone or in
combination(s).

In a first, preferred variant, the target polynucleotide construct comprises a genetic 
element allowing transfer of the vector into a selected host cell. 

The natural mechanism of DNA transfer between donor and recipient strains is known as
conjugation or conjugative transfer. Conjugative transfer can occur between
different strains of the same species as well as between strains of different species. Many
naturally occurring plasmids carry so-called tra genes, which are involved in and mediate
conjugative transfer. DNA transfer starts at a specific DNA structure, known as an
origin of transfer or "oriT". The presence of such an oriT in a vector allows said vector
to be transferred into a desired host cell.

In a particular, preferred embodiment, the target polynucleotide construct comprises an
origin of transfer functional in the selected host cell.

The structure of various oriT has been reported in the art (Guiney et al. 1983; Zechner et
al. 2000). In a specific embodiment, the origin of transfer is selected (or derived) from
RP4, pTiC58, F, RSF1010, ColE1 and R6K(α).



A specific example of an oriT which can be used in the present invention derives from
plasmid RP4 and has or comprises all or a functional part of the following sequence
(SEQ ID NO: 5):



gatctGTGATGTACTTCACCAGCTCCGCGAAGTCGCTCTTCTTGATTGGAGCGCATGGG 

GACGTGCTTGGCAATCACGCGCACCCCCCGGCCGTTTTAGCGGCTAAAAAAGTCAT 

GGCTCTGCCCTCGGGCGGACCACGCCCATCATGACCTTGCCAAGCTCGTCCTGCTTC 

TCTTCGATCTTCGCCAGCAGGGCGAGGATCGTGGCATCACCGAACCGCGCCGTGCG
CGGGTCGTCGGTGAGCCAGAGTTTCAGCAGGCCGCCCAGGCGGCCCAGGTCGCCAT 
TGATGCGGGGCAGCTCGCGGACGTGCTCATAGTCCACGACGCCCGTGATTTTGTAGC 
CCTGGCCGACGGCCAGCAGGTAGGCCGACAGGCTCATGCCGGCCGCCGCCGCCTTT 
TCCTCAATCGCTCTTCGTTCGTCTGGAAGGCAGTACACCTTGATAGGTGGGCTGCCC

TTCCTGGTTGGCTTGGTTTCATCAGCCATCCGCTTGCCCTCATCTGTTACGCCGGCGG
TAGCCGGCCAGCCTCGCAGAGCAGGATTCCCGTTGAGCACCGCCAGGTGCGAATAA 
GGGACAGTGAAGAAGGAACACCCGCTCGCGGGTGGGCCTACTTCACCTATCCTGCC 
CGGCTGACGCCGTTGGATACACCAAGGAAAGTCTACACGAACCCTTTGGCAAAATC 
CTGTATATCGTGCGAAAAAGGATGGATATACCGAAAAAATCGCTATAATGACCCCG 

AAGCAGGGTTATGCAGCGGAAAAGATCCGTCGGATCT



The term "functional part" designates any fragment or variant of the above sequence
which retains the capacity to cause conjugative transfer. Such fragments typically
comprise at least 80%, preferably at least 85% or 90% of the above sequence. Variants
may include one or several mutations, substitutions, deletions or additions of one or
several bases.



In another particular variant, the target polynucleotide construct comprises a genetic
element allowing integration of the vector (or of the polynucleotide contained therein)
into the genome of the selected host cell.

A donor DNA is permanently or stably maintained and expressed in a selected recipient
cell if it is integrated into the recipient cell's genome or if it contains elements that allow
autonomous replication in said cell. In a most preferred embodiment, the vector is
modified to allow transfer and integration of the polynucleotide into the host cell
genome.



Integration is a preferred way of ensuring stable expression. Integration can be obtained
by physical recombination. Recombination can be homologous, i.e., between two
homologous DNA sequences, or illegitimate, where recombination occurs between two
non-homologous DNAs. As a particular example, integration of donor DNA into the
chromosome of the recipient can be mediated by the host recombination repair system or by
site-specific recombination. Another well-studied process that can transfer and integrate
genes is transduction by bacterial viruses, such as λ and ΦC31. In a phage-infected
bacterial cell, fragments of the host DNA are occasionally packaged into phage particles
and can then be transferred to a recipient cell. Integration into the recipient cell's genome
is caused by an integrase.



In a specific embodiment, the target polynucleotide construct comprises a nucleic acid
encoding an integrase functional in the selected host cell. More preferably, the integrase
is selected from λ and φC31 integrases. In a specific embodiment, the polynucleotide
construct comprises a nucleic acid encoding an integrase having or comprising all or a
functional part of the following sequence of the φC31 integrase (SEQ ID NO: 6):



AGATCTCCCGTACTGACGGACACACCGAAGCCCCGGCGGCAACCCTCAGCGGATGC 
CCCGGGGCTTCACGTTTTCCCAGGTCAGAAGCGGTTTTCGGGAGTAGTGCCCCAACT
GGGGTAACCTTTGAGTTCTCTCAGTTGGGGGCGTAGGGTCGCCGACATGACACAAG 
GGGTTGTGACCGGGGTGGACACGTACGCGGGTGCTTACGACCGTCAGTCGCGCGAG 
CGCGAGAATTCGAGCGCAGCAAGCCCAGCGACACAGCGTAGCGCCAACGAAGACA 
AGGCGGCCGACCTTCAGCGCGAAGTCGAGCGCGACGGGGGCCGGTTCAGGTTCGTC 
GGGCATTTCAGCGAAGCGCCGGGCACGTCGGCGTTCGGGACGGCGGAGCGCCCGGA
GTTCGAACGCATCCTGAACGAATGCCGCGCCGGGCGGCTCAACATGATCATTGTCT 
ATGACGTGTCGCGCTTCTCGCGCCTGAAGGTCATGGACGCGATTCCGATTGTCTCGG 
AATTGCTCGCCCTGGGCGTGACGATTGTTTCCACTCAGGAAGGCGTCTTCCGGCAGG 
GAAACGTCATGGACCTGATTCACCTGATTATGCGGCTCGACGCGTCGCACAAAGAA 
TCTTCGCTGAAGTCGGCGAAGATTCTCGACACGAAGAACCTTCAGCGCGAATTGGG
CGGGTACGTCGGCGGGAAGGCGCCTTACGGCTTCGAGCTTGTTTCGGAGACGAAGG
AGATCACGCGCAACGGCCGAATGGTCAATGTCGTCATCAACAAGCTTGCGCACTCG

ACCACTCCCCTTACCGGACCCTTCGAGTTCGAGCCCGACGTAATCCGGTGGTGGTGG 

CGTGAGATCAAGACGCACAAACACCTTCCCTTCAAGCCGGGCAGTCAAGCCGCCAT 

TCACCCGGGCAGCATCACGGGGCTTTGTAAGCGCATGGACGCTGACGCCGTGCCGA 

CCCGGGGCGAGACGATTGGGAAGAAGACCGCTTCAAGCGCCTGGGACCCGGCAACC 

GTTATGCGAATCCTTCGGGACCCGCGTATTGCGGGCTTCGCCGCTGAGGTGATCTAC 

AAGAAGAAGCCGGACGGCACGCCGACCACGAAGATTGAGGGTTACCGCATTCAGCG 

CGACCCGATCACGCTCCGGCCGGTCGAGCTTGATTGCGGACCGATCATCGAGCCCG 

CTGAGTGGTATGAGCTTCAGGCGTGGTTGGACGGCAGGGGGCGCGGCAAGGGGCTT 

TCCCGGGGGCAAGCCATTCTGTCCGCCATGGACAAGCTGTACTGCGAGTGTGGCGC 

CGTCATGACTTCGAAGCGCGGGGAAGAATCGATCAAGGACTCTTACCGCTGCCGTC 

GCCGGAAGGTGGTCGACCCGTCCGCACCTGGGCAGCACGAAGGCACGTGCAACGTC 

AGCATGGCGGCACTCGACAAGTTCGTTGCGGAACGCATCTTCAACAAGATCAGGCA 

CGCCGAAGGCGACGAAGAGACGTTGGCGCTTCTGTGGGAAGCCGCCCGACGCTTCG 

GCAAGCTCACTGAGGCGCCTGAGAAGAGCGGCGAACGGGCGAACCTTGTTGCGGAG 

CGCGCCGACGCCCTGAACGCCCTTGAAGAGCTGTACGAAGACCGCGCGGCAGGCGC 

GTACGACGGACCCGTTGGCAGGAAGCACTTCCGGAAGCAACAGGCAGCGCTGACGC 

TCCGGCAGCAAGGGGCGGAAGAGCGGCTTGCCGAACTTGAAGCCGCCGAAGCCCCG 

AAGCTTCCCCTTGACCAATGGTTCCCCGAAGACGCCGACGCTGACCCGACCGGCCCT 

AAGTCGTGGTGGGGGCGCGCGTCAGTAGACGACAAGCGCGTGTTCGTCGGGCTCTT 

CGTAGACAAGATCGTTGTCACGAAGTCGACTACGGGCAGGGGGCAGGGAACGCCCA 

TCGAGAAGCGCGCTTCGATCACGTGGGCGAAGCCGCCGACCGACGACGACGAAGAC 

GACGCCCAGGACGGCACGGAAGACGTAGCGGCGTAGCGAGACACCCG 



The term "functional part" designates any fragment or variant of the above sequence
that retains the capacity to cause integration. Such fragments typically comprise at least
80%, preferably at least 85% or 90%, of the above sequence. Variants may include one or
several mutations, substitutions, deletions or additions of one or several bases.



In a more preferred variant, the target polynucleotide construct comprises genetic 
elements allowing transfer of the cloning vector into the selected host cell and 
integration of the cloning vector or a portion thereof into the genome of the selected host 
cell. Most preferred polynucleotide constructs comprise an oriT and a nucleic acid
encoding an integrase. 



In another variant, the target polynucleotide construct comprises an origin of replication
specific for or functional in the selected host cell. The origin of replication may be
selected (or derived), for instance, from pAMβ1, pSa, 2µm circle, pSAM2, pSG1, pIJ101,
SCP2, pA387 and artificial chromosomes.

In another variant, the target polynucleotide construct comprises a transcriptional
promoter functional in the selected host cell. As indicated above, in a particular variant,
the invention allows the cloning vector to be modified to enable expression or
over-expression of the cloned polynucleotides in the selected host. The expression of genes is
driven mainly by transcriptional promoters, which initiate gene transcription. The type of
promoter to be used in the present invention can be selected by the skilled person,
depending on the selected host cell and type of expression needed. Promoters may be
ubiquitous or cell-specific, regulated or constitutive, weak or strong. They may be of
various origins, including promoters isolated from viruses, phages, plant cells, bacterial
genes, mammalian genes, etc., or they may be artificial or chimeric. Typical examples of
promoters include T7, T4, LacZ, trp, ara, SV40, tac, λPL, GAL, AOX, hsp-70, etc.

The target polynucleotide construct is typically a DNA molecule, although RNAs may
also be used as starting material. It is typically a double-stranded DNA. The target
polynucleotide construct may be produced by conventional recombinant DNA
techniques, including DNA synthesis, cloning, ligation, restriction digestion, etc., and
combinations thereof.

The target polynucleotide construct is preferably engineered so as to be inserted in a
region of the vector distinct from the polynucleotide. Indeed, it is important that the
integrity of the polynucleotides is preserved. Directed insertion may be accomplished in
a variety of ways, including site-specific insertion using particular enzymatic systems
(Cre/Lox, FLP, etc.), homologous recombination with particular target sequences present
in the vector, or by the use of transposons or transposable elements and appropriate
selection means.

In a particular, preferred embodiment, the target polynucleotide construct is contained in
or comprises a transposable nucleic acid construct. Indeed, in a preferred variant, the
methods of the present invention use transposable elements to alter the properties of the
cloning vectors, and allow their transfer, maintenance, expression or over-expression in a
selected host cell.

Transposable nucleic acid constructs are derived from transposons, which are genetic
elements capable of moving from one genetic locus to another. Two main classes of
transposons have been identified in bacteria. The simplest transposons comprise an
insertion sequence that carries only elements of transposition. These elements are two
inverted DNA repeats and a gene that codes for a protein called transposase. The
transposase catalyses the excision and integration of the transposon. It has been shown
that the excision and integration reaction can be catalysed in trans by a transposase,
which can be provided in vivo or in vitro in purified form or expressed from a different
construct. More complex transposons carry more insertion sequences and additional
genes that are not involved in transposition.

Transposable nucleic acid constructs of this invention thus typically comprise, flanked
by two inverted repeats, the target polynucleotide construct and, more preferably, a
marker gene. In the presence of a transposase, these transposable nucleic acid constructs
can integrate into a cloning vector in vivo or in vitro, thereby providing for targeted
polynucleotide insertion. Alternatively, such nucleic acid constructs can be used for
targeted integration, in the absence of a transposase, in particular strains such as
hypermutator strains. Such transposable nucleic acid constructs also represent a
particular object of the present application. In this regard, in a more preferred
embodiment, the invention also relates to a transposable nucleic acid construct, wherein
said construct comprises an origin of transfer flanked by two inverted repeats. Specific
examples of such constructs are transposons pPL1 and pPAOI6, as disclosed in the
experimental section. The transposable nucleic acid construct may further comprise an
integrase gene and/or a marker gene.

The inverted repeat nucleic acid sequences may be derived from the sequence of various
transposons, or artificially created. In particular, transposable elements can be generated
using inverted repeats obtained from transposons or transposable elements such as Tn5,
Tn21, miniTn5, Tn7, Tn10, Tn917, miniTn400, etc. Preferably, the sequences derive from
transposon Tn5. In a specific embodiment, they comprise all or a functional part of the
following sequences:

- left arm of transposon pPAOI6 (SEQ ID NO: 7)

CTGTCTCTTATACACATCTCAACCATCATCGATGAATTTTCTCGGGTGTTCTCGCATA
TTGGCTCGAATTCGAGCTCGGTACCC

- right arm of transposon pPAOI6 (SEQ ID NO: 8)

GATCCTCTAGAGTCGACCTGCAGGCATGCAAGCTTGCCAACGACTACGCACTAGCC
AACAAGAGCTTCAGGGTTGAGATGTGTATAAGAGACAG
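
The inverted-repeat relationship between the two arms can be checked computationally. The following is a minimal sketch in Python, assuming the arm sequences exactly as listed above (SEQ ID NO: 7 and 8). The helper names are illustrative only and are not part of the disclosure.

```python
# Illustrative check that the outer ends of the two transposon arms are
# inverted repeats of each other. Function names are hypothetical.

LEFT_ARM = (
    "CTGTCTCTTATACACATCTCAACCATCATCGATGAATTTTCTCGGGTGTTCTCGCATA"
    "TTGGCTCGAATTCGAGCTCGGTACCC"
)  # SEQ ID NO: 7 as listed above
RIGHT_ARM = (
    "GATCCTCTAGAGTCGACCTGCAGGCATGCAAGCTTGCCAACGACTACGCACTAGCC"
    "AACAAGAGCTTCAGGGTTGAGATGTGTATAAGAGACAG"
)  # SEQ ID NO: 8 as listed above

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def shared_inverted_repeat(left: str, right: str) -> str:
    """Longest stretch at the start of `left` that reappears, inverted,
    at the very end of `right`."""
    mirrored = reverse_complement(right)
    n = 0
    while n < min(len(left), len(mirrored)) and left[n] == mirrored[n]:
        n += 1
    return left[:n]


repeat = shared_inverted_repeat(LEFT_ARM, RIGHT_ARM)
print(len(repeat), repeat)
```

The transposase recognises the two outer ends, so a non-empty shared repeat at the extremities is the property one would expect from arms derived from Tn5.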

The marker gene may be any nucleic acid encoding a molecule whose presence in a cell
can be detected or visualized. Typical marker genes encode proteins conferring
resistance to antibiotics, such as apramycin, chloramphenicol, ampicillin, kanamycin,
spectinomycin, thiostrepton, etc. Other types of markers confer auxotrophy or produce a
label (e.g., galactosidase, GFP, luciferase, etc.).

In a specific embodiment, the cloning vector in the library comprises a first marker gene
and the modification step ii) comprises:
- contacting in vitro, in the presence of a transposase, the selected cloning vectors with a
transposon comprising, flanked by two inverted repeats, the target polynucleotide
construct and a second marker gene distinct from the first marker gene, and
- selecting the cloning vectors which have acquired the second marker gene and which
have lost the first marker gene.

The double selection ensures that the target polynucleotide construct has been inserted at
a site within the marker gene present in the cloning vector, i.e., outside of the
polynucleotide insert.
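
The logic of the double selection can be sketched as follows. This is a simplified model in Python, assuming that any insertion inside the first marker gene inactivates it. The vector coordinates, region names and function names are invented for illustration and do not describe an actual vector of the library.

```python
# Simplified model of the double selection. All coordinates are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    def contains(self, pos: int) -> bool:
        return self.start <= pos <= self.end


FIRST_MARKER = Region("first marker gene", 500, 1160)
ENV_INSERT = Region("environmental DNA insert", 3000, 43000)


def classify_insertion(pos: int) -> dict:
    """Predict the phenotype of a clone whose vector carries a transposon at `pos`."""
    in_marker = FIRST_MARKER.contains(pos)
    return {
        # every transposed vector carries the second marker of the transposon
        "second_marker_resistant": True,
        # assumption: an insertion inside the first marker gene inactivates it
        "first_marker_resistant": not in_marker,
        "insert_intact": not ENV_INSERT.contains(pos),
    }


def passes_double_selection(pos: int) -> bool:
    """Keep clones that gained the second marker and lost the first one."""
    phenotype = classify_insertion(pos)
    return phenotype["second_marker_resistant"] and not phenotype["first_marker_resistant"]


for pos in (800, 2000, 20000):
    print(pos, passes_double_selection(pos), classify_insertion(pos)["insert_intact"])
```

In this model every clone that passes the double selection necessarily has an intact environmental insert, because the first marker gene and the insert occupy separate regions of the vector.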

It should be understood that the modification may be accomplished in various other
ways, particularly by incorporating a sequence coding for the transposase directly into
the transposable element or into another expression unit. The presence and expression of
the transposase can be regulated by an inducible promoter or by thermosensitive replicative
units. Also, transposition can be carried out by an in vitro process.

Analysis of the polynucleotides

In step (iii) of the process, the polynucleotides may be analysed by various methods,
including by genetic, biochemical, chemical or phenotypical approaches, which are well
known per se in the art. Analysis occurs upon transfer and, optionally, expression of the
polynucleotides into the selected host cell.

In this regard, the modified cloning vectors can be transferred into the selected host cell
by a variety of techniques known in the art, including by transformation, electroporation,
transfection, protoplast fusion, conjugative transfer, etc. In a preferred embodiment, the
target polynucleotide construct comprises an oriT and the modified vectors are
transferred into the selected host cells by conjugative transfer. In this embodiment, the
cloning vector and the selected host cells are co-cultivated and the recombinant host cells
are selected and isolated.

The selected host cell may be any type of cell or microorganism, including, without
limitation, Streptomyces, E. coli, Salmonella, Bacillus, yeast, fungi, etc.

One of the objectives of the invention is to be able to analyse environmental DNAs of
unknown cellular origin in different host expression systems. In order to analyse the
potential of the DNAs at the transcription and/or translation levels and to increase the
probability of obtaining DNA expression, it is important to be able to test different
host expression systems.

Insertion of foreign DNA into host expression systems like Streptomyces can produce,
for instance, an increase in doubling time, morphological modifications, pigment
production, etc., which can be related either directly to the expression of foreign DNA or
to combinatorial biology between the foreign DNA and the biology of the expression host
systems. The new phenotypes can be analysed by all techniques known in the art, such
as genetic, biochemical, chemical or phenotypic approaches.

In a preferred, specific embodiment, the invention relates to a method for the
identification or cloning of polynucleotides encoding a selected phenotype, the method
comprising (i) cloning environmental DNA fragments into E. coli cloning vectors to
produce a metagenomic library, (ii) identifying or selecting cloning vectors in said
library which contain DNA fragments having a particular characteristic of interest, (iii)
modifying the identified or selected cloning vectors into shuttle or expression vectors for
transfer and integration in a selected host cell, (iv) transferring the modified cloning
vectors into said selected host cell and (v) identifying or cloning the DNA fragments
contained in said modified cloning vectors which encode said selected phenotype in said
selected host cell.

By applying the above method, new polynucleotide sequences have been identified,
cloned and characterized, which produce new phenotypes in bacteria. These
polynucleotides contain the sequence of PKS genes and other genes that encode
polypeptides involved in biosynthetic pathways. The sequence of these polynucleotides
is provided in Figure 5. The invention also relates to any polynucleotide sequence
comprising all or part of these sequences (i.e., SEQ ID NOs: 1 or 2), their
complementary strand, or a functional variant thereof. A part of the above sequences
includes, preferably, at least 20 consecutive bases, more preferably at least 50
consecutive bases thereof, even more preferably a coding sequence (e.g., a CDS). In
this respect, SEQ ID NOs: 1 and 2 comprise several novel open reading frames encoding
novel polypeptides involved in biosynthetic pathways. These coding sequences are
identified in Figures 5a and 5b.

In a specific embodiment, the invention relates to a polynucleotide sequence comprising
a sequence selected from nucleotides (CDS) 76 - 1134 ; 1096 - 2430 ; 1178 - 1624 ;
2506 - 3567 ; 2906 - 4222 ; 4092 - 5321 ; 6337 - 8502 ; 8181 - 9530 ; 9531 - 10721 ;
10504 - 11274 ; 12874 - 13689 ; 14195 - 15976 ; 15427 - 16512 ; 15579 - 16253 ;
16505 - 17656 ; 17657 - 18697 ; 18615 - 19304 ; 19301 - 20596 ; 20535 - 21476 ;
22025 - 22951 ; 23155 - 26523 ; 26409 - 34433 ; 34418 - 37500 and 35359 - 37500 of
SEQ ID NO: 1 (Figure 5a) or a complementary strand thereof.
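
Coordinate lists of this kind are easy to handle programmatically. The sketch below, in Python, parses the first few intervals listed above, assuming that the coordinates are 1-based and inclusive (the usual convention for sequence listings, but an assumption here). The function names are illustrative.

```python
# Parse "start - end ; start - end" coordinate lists and derive CDS lengths.
# Assumption: coordinates are 1-based and inclusive.
import re

CDS_TEXT = "76 - 1134 ; 1096 - 2430 ; 1178 - 1624 ; 2506 - 3567 ; 2906 - 4222 ; 4092 - 5321"


def parse_cds(text: str) -> list:
    """Return (start, end) tuples found in a coordinate string."""
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\s*-\s*(\d+)", text)]


def cds_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based, inclusive interval."""
    return end - start + 1


def extract(seq: str, start: int, end: int) -> str:
    """Slice a 1-based, inclusive interval out of a Python string."""
    return seq[start - 1:end]


for start, end in parse_cds(CDS_TEXT):
    n = cds_length(start, end)
    print(f"{start}-{end}: {n} nt, whole number of codons: {n % 3 == 0}")
```

Under the inclusive convention each of these six intervals spans a whole number of codons, which is consistent with their designation as coding sequences.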

In another specific embodiment, the invention relates to a polynucleotide sequence
comprising a sequence selected from nucleotides 3 - 914 ; 924 - 2168 ; 2207 - 3190 ;
3373 - 4455 ; 4546 - 4959 ; 5176 - 6192 ; 6331 - 14043 ; 14275 - 15408 ; 15436 - 16245 ;
16287 - 17384 ; 17427 - 18158 ; 18248 - 18847 ; 18952 - 20346 ; 20442 - 21167 ;
21164 - 24301 ; 24351 - 27023 ; 27806 - 29686 ; 29535 - 30872 ; 30848 - 32647 ;
32574 - 35555 ; 35533 - 36598 and 36516 - 37400 of SEQ ID NO: 2 (Figure 5b) or a
complementary strand thereof.

Variants of these sequences include any naturally-occurring variant comprising one or
several nucleotide substitutions; sequence variants resulting from the degeneracy of the
genetic code; as well as synthetic variants coding for functional polypeptides. Variants
include any sequence that hybridises under highly stringent conditions, as disclosed for
instance in Sambrook et al., to any of the above sequences, and encodes a functional
polypeptide. The invention also includes any nucleic acid molecule encoding a
polypeptide comprising all or a fragment of an amino acid sequence encoded by a
polynucleotide as disclosed above. Preferably, the fragment comprises at least 10
consecutive amino acid residues, more preferably at least 20, even more preferably at
least 30.

The invention also relates to any vector comprising these polynucleotide sequences.
These sequences may be DNA or RNA, preferably DNA, even more preferably
double-stranded DNA. The invention also relates to a polypeptide encoded by a polynucleotide
sequence as defined above. The invention also relates to a method of producing such
polypeptides by recombinant techniques, comprising expressing a polynucleotide as
defined above in any suitable host cell and recovering the encoded polypeptide. The
invention also relates to a recombinant host cell comprising a polynucleotide or a vector
as defined above.

Another object of this invention resides in a library of polynucleotides, wherein said
library comprises a plurality of environmental DNA fragments cloned into cloning
vectors, wherein said environmental DNA fragments contain a common molecular
characteristic and wherein said cloning vectors are E. coli cloning vectors comprising a
target polynucleotide construct allowing transfer and integration of the environmental
DNA into the genome of a selected host cell distinct from E. coli.

The sub-DNA-libraries should have either desired genetic characteristics based on high
or low GC content, DNA encoding a desired enzymatic activity, part or full
biosynthetic pathways for metabolites, etc., or a specific origin such as a soil fraction,
animal organs, a sub-fraction of a microorganism community, etc. The invention also
allows the production of conjugative vectors with desired characteristics in accordance with
the characteristics of the pre-identified sub-DNA-libraries, and functional analysis of mutants
in heterologous hosts. It can also be used, without limitation, for the production of
mutants by mutagenesis, for DNA sequencing, for gene or biosynthetic pathway knock-out
by insertion, or to confer transfer capabilities for expression, co-expression,
over-expression or modification of biosynthetic pathways.

Further aspects and advantages of the invention will be disclosed in the following
examples, which should be regarded as illustrative and not limiting the scope of this
application.

Experimental Section

A - From E. coli to Streptomyces

In this work, we constructed a fosmid library in E. coli from total DNA prepared directly
from soil. The library has been screened for the presence of biosynthetic pathways. We
developed genetic tools for functional genomics that allow gene identification,
inactivation and horizontal gene transfer from E. coli to Streptomyces.

The cloning vectors in the library contain the ColE1 replicon for propagation in E. coli.
Transposable elements based on the Tn5 transposon were produced and used for in vitro
modification of selected cloning vectors. Integrated transposable elements contain a gene
for resistance to apramycin. Conjugative derivatives were constructed by incorporating
the origin of transfer from plasmid RP4. A conjugative and site-specific integrative
transposon was also constructed, comprising the integrase gene from phage φC31,
including the attP attachment site. Conjugal transfer was demonstrated from an appropriate
E. coli donor cell to another E. coli or Streptomyces lividans recipient cell. The constructed
transposon was tested for inactivation of the genes cloned into fosmids. The mutants
obtained can be used for direct sequencing with appropriate primers and transferred by
conjugation into Streptomyces lividans. Transposable elements thus represent very useful
tools for functional analysis of large DNA libraries cloned into BACs, PACs, fosmids or
other cloning vectors in which cloned inserts must be transferred into a heterologous host.

Materials and methods 

Bacterial strains, plasmids and growth conditions.

E. coli strain DH10B (F- mcrA Δ(mrr-hsdRMS-mcrBC) φ80dlacZΔM15 ΔlacX74
deoR recA1 endA1 araD139 Δ(ara, leu)7697 galU galK λ- rpsL nupG)
(Epicentre) was used for fosmid and plasmid transformation and DNA amplification.
Unless specifically described, all DNA manipulations were performed according to
Sambrook et al. (1989).

Soil DNA extraction and DNA library construction.

Total bacterial community DNA was extracted and large DNA libraries were
constructed in fosmids according to the method described in WO 01/81357.

Fosmid DNA extraction and purification

Fosmid DNA from the soil library was extracted from pools of 96 clones. Cultures of
recombinant clones were grown in 96- and 48-deep-well plates, in 1 ml and 2 ml,
respectively, of LB medium containing 12.5 µg ml⁻¹ of chloramphenicol. Cultures were grown at
37°C with shaking at 250 rpm for 22 hours. DNA extraction was done by using the
Nucleobond PC100 extraction kit (Macherey Nagel).

PCR screening for the detection of PKS genes

Primer design

Degenerate PCR primer sets were designed to specifically amplify PKS nucleic acid
sequences. Multiple sequence alignment of PKS domains revealed highly conserved
motifs, in particular in the β-ketoacyl synthase domain. Primers Lib1F and Lib2R
were designed in conserved motifs of the β-ketoacyl synthase gene. Lib1F (sense
primer, 5'- GGSCCSKCSSTSDCSRTSGAYACSGC -3') and Lib2R (antisense primer,
5'- GCBBSSRYYTCDATSGGRTCSCC -3') were deduced from β-ketoacyl synthase
peptide sequences GP(AS)(LV)(AST)(IV)DTAC and GDPIE(TVA)(RAQ)A,
respectively. The specific fragment amplified with Lib1F / Lib2R was approximately
465 bp (corresponding to 155 amino acids). Specificity and efficiency of the PCR
systems were validated by testing on positive DNA controls (i.e. genomic DNA from
type I PKS producing strains such as Bacillus subtilis, Streptomyces lividans,
Streptomyces ambofaciens and Ralstonia solanacearum) and negative DNA controls
(genomic DNA from strains which are known not to contain PKS genes).
Furthermore, DNA extracted from soil samples was tested to calibrate PCR techniques.
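
The degeneracy of the two primers can be worked out from the standard IUPAC nucleotide ambiguity codes. The following Python sketch uses the primer sequences given above. The function names are illustrative, and the calculation is an aid to reading the primer design, not a description of how the primers were actually designed.

```python
# Degeneracy of the Lib1F / Lib2R primers under the IUPAC ambiguity code.
from itertools import product

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

LIB1F = "GGSCCSKCSSTSDCSRTSGAYACSGC"
LIB2R = "GCBBSSRYYTCDATSGGRTCSCC"


def degeneracy(primer: str) -> int:
    """Number of distinct plain sequences represented by a degenerate primer."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n


def expand(primer: str):
    """Yield every plain sequence represented by a degenerate primer."""
    for combo in product(*(IUPAC[b] for b in primer)):
        yield "".join(combo)


for name, primer in (("Lib1F", LIB1F), ("Lib2R", LIB2R)):
    print(name, len(primer), "nt, degeneracy", degeneracy(primer))

# Amplicon size versus encoded peptide length quoted in the text
print(465 // 3, "amino acids")
```

The use of S (G or C) at most third codon positions reflects a bias towards GC-rich codons, which keeps the degeneracy of each primer in the low thousands.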

PCR conditions

PCR conditions were optimised, in particular for concentrations of DMSO, MgCl2,
primers and DNA template quantities. For PCR using microorganism genomic DNA and
soil DNA as template (50 to 200 ng), the PCR mix (50 µl) contained 250 µM of dNTP, 5
mM MgCl2 final, 2.5% DMSO, 1X PCR buffer, 0.75 µM of each primer, 2.5 U of
Taq DNA polymerase (Sigma) and sterile distilled water. For PCR using fosmid pooled
DNA as template (100 to 500 ng), the PCR mix (50 µl) contained 250 µM of dNTP, 5
mM MgCl2 final, 5% DMSO, 1X PCR buffer, 0.75 µM of each primer, 2.5 U of Taq
DNA polymerase (Sigma) and sterile distilled water. For identification of positive
clones in 96-well microtiter plates, 25 µl of each bacterial culture was used as template and
PCR conditions were the same as above. The thermocycling program was: a denaturation
step at 96°C for 5 minutes; then cycles of 1 minute at 96°C, 1 minute at 65°C and 1 minute
at 72°C. For the first 7 cycles, the annealing temperature was lowered by 1°C per cycle until
58°C was reached. A subsequent 40 cycles were carried out with the annealing temperature at
58°C. A final extension step was at 72°C for 7 minutes. For identification of positive
clones in 96-well microtiter plates, the first denaturation step at 96°C lasted 8 minutes.
The other steps were the same as described above. PCR reactions were performed with a
PTC 200 thermocycler (MJ Research).
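
The touchdown program can be written out cycle by cycle. The Python sketch below reflects one reading of the description above, namely that the seven touchdown cycles anneal at 65°C down to 59°C and are followed by 40 cycles at 58°C. That reading is an assumption, and the function name and parameters are illustrative.

```python
# One possible reading of the touchdown PCR program described above.
def touchdown_program(start_anneal=65, final_anneal=58, step=1, final_cycles=40):
    """Return one (denature, anneal, extend) temperature tuple in °C per cycle."""
    cycles = []
    anneal = start_anneal
    while anneal > final_anneal:          # touchdown phase, 1 °C lower each cycle
        cycles.append((96, anneal, 72))
        anneal -= step
    cycles.extend([(96, final_anneal, 72)] * final_cycles)  # constant phase
    return cycles


program = touchdown_program()
touchdown = [anneal for _, anneal, _ in program if anneal > 58]
print(len(touchdown), "touchdown cycles:", touchdown)
print(len(program), "cycles in total")
```

Starting above the expected annealing temperature favours the most specific primer-template pairs in the early cycles, which is useful with highly degenerate primers.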

PCR product analysis

PCR products of about 465 bp were purified on agarose gel with a gel extraction kit
(Qiagen) according to the manufacturer's recommendations. The first approach consisted in
subcloning PCR products using the Topo PCR II kit (Invitrogen). Recombinant plasmids
were extracted using the QIAprep plasmid extraction kit (Qiagen) and sequenced with
forward and reverse M13 primers on a CEQ 2000 automated sequencer (Beckman
Coulter). The second approach consisted in direct sequencing of PCR products. Sequencing
data were compared with the nucleic acid and protein GenBank databases using the BLAST program.

Sequencing of the identified fosmid insert DNA and sequence analysis 

Fosmid inserts were sequenced using either a transposon-mediated or a shotgun
subcloning approach. Transposition was carried out using a transposition kit
(Epicentre) according to the manufacturer's instructions. For shotgun subcloning,
transformants were grown for 16 hours at 37°C. Fosmid extraction was done by using
the Nucleobond PC100 extraction kit (Macherey Nagel). DNA was partially digested
with Sau3A, size-selected by standard gel electrophoresis for fragments ranging from 1 to 3
kb and cloned into a Bluescript vector according to Sambrook et al. (1989).

Sequence analysis was performed by identifying ORFs using FramePlot analysis of
GC3 (the GC content at the third codon position). Each identified ORF was compared to
gene databases by using the BLAST program. PKS domains were determined by aligning the
obtained sequences against previously described PKS domains from domain databases.
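
The idea behind a GC3-based frame analysis can be illustrated with a few lines of Python. This is a deliberately simplified sketch that scores a whole sequence in each of the three forward frames. It is not the FramePlot implementation, the example sequence is invented, and the function names are illustrative.

```python
# Simplified GC3 scoring: fraction of G or C at the third codon position.
def gc3(seq: str, frame: int = 0) -> float:
    """GC fraction at third codon positions, reading from `frame` (0, 1 or 2)."""
    thirds = seq.upper()[frame + 2::3]
    if not thirds:
        return 0.0
    return sum(base in "GC" for base in thirds) / len(thirds)


def best_frame(seq: str):
    """Return the frame with the highest GC3 together with all three scores."""
    scores = [gc3(seq, f) for f in range(3)]
    return max(range(3), key=scores.__getitem__), scores


example = "ATGACCGGCGTCGACCTGGCCGAGCTGCGC"  # invented high-GC coding sequence
frame, scores = best_frame(example)
print(frame, [round(s, 2) for s in scores])
```

In high-GC organisms such as Streptomyces, the third codon position of genuine coding frames is strongly enriched in G and C, so the frame with the highest GC3 is the most likely coding frame.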

Sequencing

Sequencing reactions were performed with 1 µg of DNA and 3.6 pmol of primer, using the
CEQ 2000 Dye Terminator Cycle Sequencing kit (Beckman Coulter) under the conditions
proposed by the supplier. Ten µl of reaction products were precipitated using 4 µl of a
solution containing 1.5 M NaOAc and 50 mM EDTA, and 60 µl of cold 95% ethanol/dH2O
from -20 °C. The pellet was washed twice with 200 µl of 70% ethanol/dH2O, vacuum
dried and dissolved in 40 µl of sample loading solution (supplied in the kit). Sequencing
reactions were run on a CEQ 2000 sequencer (Beckman Coulter).



Plasmid construction and validation

Plasmid pP1 was constructed as follows. A 941 bp DNA fragment containing the native
promoter region and the aa(3)IV gene was amplified by polymerase chain reaction (PCR)
using primers AmF (d-CCCTAAGATCTGGTTCATGTGCAGCTCCATC, SEQ ID
NO: 9) and AmR (d-TAGTACCCGGGGATCCAACGTCATCTCGTTCTCC, SEQ ID
NO: 10). One-hundred-microlitre reactions were performed containing 0.1 µM of each
primer, 1X Vent DNA polymerase buffer (NEB), 0.2 µM of each deoxyribonucleoside
triphosphate (dNTP), 50 ng of the DNA template and 2 U of Vent DNA polymerase
(NEB). The PCR mixture was heated for 4 min at 94 °C in a PTC-200 thermocycler (Peltier)
and cycled 25 times at 94 °C for 60 sec, 59 °C for 30 sec and 72 °C for 70 sec. The final
extension was performed at 72 °C for 7 min.

The PCR product was purified using the GFX DNA purification kit (Amersham), then digested
with Bgl II and Sma I restriction enzymes. A 941 bp Bgl II/Sma I fragment was inserted into
the Bam HI and Sma I sites of the pMOD plasmid (Epicentre). DH10B E. coli was
transformed with pP1 and subjected to apramycin selection on LB agar plates. Six
colonies surviving apramycin selection were grown in liquid LB medium and final
pP1 candidates were thoroughly checked via PCR and restriction mapping.
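
The cloning strategy relies on restriction sites built into the primers and on the compatibility of Bgl II and Bam HI cohesive ends. The Python sketch below locates the sites in primers AmF and AmR as given above. The recognition sequences and overhangs are those of the standard enzymes, while the dictionary and function names are illustrative.

```python
# Locate the restriction sites engineered into the AmF / AmR primers.
SITES = {"BglII": "AGATCT", "BamHI": "GGATCC", "SmaI": "CCCGGG"}
# 5' overhang left by each enzyme; SmaI cuts blunt (empty overhang).
OVERHANG = {"BglII": "GATC", "BamHI": "GATC", "SmaI": ""}

PRIMERS = {
    "AmF": "CCCTAAGATCTGGTTCATGTGCAGCTCCATC",
    "AmR": "TAGTACCCGGGGATCCAACGTCATCTCGTTCTCC",
}


def find_sites(seq: str) -> dict:
    """Map enzyme name to the 0-based position of its first site in `seq`."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """True when the two enzymes leave identical ends that can be ligated."""
    return OVERHANG[enzyme_a] == OVERHANG[enzyme_b]


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
print("BglII end ligates to BamHI end:", compatible("BglII", "BamHI"))
```

Because Bgl II and Bam HI both leave a GATC overhang, the Bgl II end of the PCR fragment can be ligated into the Bam HI site of the vector, while the Sma I ends are joined blunt.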

To construct the conjugative plasmid pPL1, a 750 bp oriT DNA region from plasmid RP4
was amplified via PCR using primers oriTF
(d-GCGGTAGATCTGTGATGTACTTCACCAGCTCC, SEQ ID NO: 11) and oriTR
(TAGTACCCGGGGATCCGACGGATCTTTTCCGCTGCAT, SEQ ID NO: 12). PCR
conditions were as above. Amplified DNA was digested using Bgl II and Sma I restriction
enzymes. The Bgl II/Sma I DNA fragment was purified after gel electrophoresis
on 0.7% agarose. The purified fragment was ligated into the pP1 plasmid digested with Bam HI
and Sma I restriction enzymes.

The φC31 integrase gene and attachment site (attP) were amplified via PCR using primers
Fint (d-AACAAAGATCTCCCGTACTGACGGACACACCG, SEQ ID NO: 13) and PJ
(d-CGGGTGTCTCGCATCGCCGCT, SEQ ID NO: 14). The amplified DNA fragment was
purified with the GFX kit (Amersham) and phosphorylated using T4 polynucleotide kinase
(NEB) under the conditions recommended by the enzyme manufacturer. The phosphorylated
DNA fragment was cloned into the pPL1 vector opened with Sma I restriction enzyme
(NEB) and dephosphorylated by calf alkaline phosphatase (NEB). DH10B E. coli was
transformed with the ligation mixture using a Bio-Rad pulsing apparatus and protocols
provided by Bio-Rad. Twelve transformants were analyzed by PCR for the presence of
the integrase gene. Orientation of the integrase gene was verified by restriction analysis using
Bgl II and EcoRI restriction enzymes. The resulting plasmid was named pPAOI6.

To construct the pPAOI6-A plasmid, the pPAOI6 plasmid was digested with EcoRI and Bgl II
restriction enzymes, followed by digestion with mung bean nuclease (NEB). The linearised
plasmid was self-ligated and transformed into DH10B cells.

Plasmid preparation of fosmid DNA 

Fosmid and BAC DNA for sequencing was prepared by using the Nucleobond AX kit
(Macherey-Nagel), following the protocol for BACs and cosmids specified by the manufacturer.

Mutagenesis 

Transposon Tn-pPAOI6 was prepared by digestion of the pPAOI6 plasmid with the Pvu II
restriction enzyme, followed by separation on an agarose gel and purification of the fragment
containing the transposon from the gel using a Qiagen kit. Equimolar amounts of transposon
and the corresponding fosmid were used for in vitro mutagenesis using Tn5 transposase
(Epicentre) under the conditions specified by the manufacturer. An aliquot of the
transposition mixture was transformed by electroporation into competent E. coli DH10B cells.
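
Matching the molar amounts of two DNA molecules of different sizes is a simple proportionality. The Python sketch below assumes an average mass of 650 g/mol per base pair of double-stranded DNA, which is a common approximation. The fosmid and transposon sizes and the input mass are hypothetical values chosen for illustration, as the text does not state them.

```python
# Equimolar amounts of two double-stranded DNAs of different lengths.
AVG_BP_MASS = 650.0  # g/mol per base pair, a common approximation for dsDNA


def pmol(ng: float, length_bp: int) -> float:
    """Picomoles of a dsDNA of `length_bp` contained in `ng` nanograms."""
    return ng * 1000.0 / (length_bp * AVG_BP_MASS)


def equimolar_ng(ref_ng: float, ref_bp: int, other_bp: int) -> float:
    """Nanograms of a second DNA matching the molar amount of the reference."""
    return ref_ng * other_bp / ref_bp


fosmid_bp = 45_000      # hypothetical fosmid size (vector plus insert)
transposon_bp = 3_500   # hypothetical transposon size
fosmid_ng = 200.0       # hypothetical amount of fosmid DNA in the reaction

tn_ng = equimolar_ng(fosmid_ng, fosmid_bp, transposon_bp)
print(f"{tn_ng:.1f} ng transposon for {fosmid_ng:.0f} ng fosmid "
      f"({pmol(fosmid_ng, fosmid_bp):.4f} pmol each)")
```

Because the transposon is much shorter than the fosmid, a much smaller mass of transposon supplies the same number of molecules.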

Conjugation E. coli - Streptomyces lividans TK24

Conjugation experiments were done using 6 x 10⁶ E. coli S17.1 cells containing
conjugative plasmids or fosmids. The E. coli cells were grown in LB medium with the
appropriate antibiotic. The cells were collected by centrifugation, washed twice using the same
volume of LB medium, concentrated to 10⁸ cells/ml and overlaid on LB plates
containing 2 x 10⁶ pregerminated Streptomyces lividans TK24 spores. The cell mixture
was grown overnight at 30°C and E. coli cells were washed off three times using 2 ml of
LB medium. The plates were overlaid with top agar containing NAL (nalidixic acid) and
the appropriate antibiotic. Plates were incubated for 4 days at 30°C and transformant
Streptomyces colonies were isolated on HT medium (Pridham et al., 1957) containing
NAL and the same antibiotic.
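
The cell numbers quoted above translate into a plating volume and a donor-to-recipient ratio. The short Python sketch below assumes that all 6 x 10⁶ donor cells of one mating are plated from the 10⁸ cells/ml suspension, which the text does not state explicitly. The variable and function names are illustrative.

```python
# Plating volume and donor:recipient ratio implied by the quoted cell numbers.
def volume_ml(cells: float, cells_per_ml: float) -> float:
    """Volume that contains `cells` at a density of `cells_per_ml`."""
    return cells / cells_per_ml


donor_cells = 6e6        # E. coli S17.1 cells per mating
donor_density = 1e8      # cells/ml after concentration
recipient_spores = 2e6   # pregerminated S. lividans TK24 spores per plate

donor_volume_ul = volume_ml(donor_cells, donor_density) * 1000
ratio = donor_cells / recipient_spores
print(f"{donor_volume_ul:.0f} µl of donor suspension, donor:recipient = {ratio:.0f}:1")
```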

Results 

Construction of a Transposon Tn <Apra>. 
The E. coli aminoglycoside-(3)-acetyltransferase IV gene (aa(3)IV) was amplified by PCR
and cloned into the pMOD vector (Epicentre). The advantage of this selective marker is that
it allows positive selection in E. coli and in Streptomyces lividans. The transposon can be used
for insertional inactivation in vitro using purified Tn5 transposase (Epicentre). The structure
of the constructed pP1 vector and transposon is shown in Figures 1a and 1b.

Construction of a conjugative plasmid-transposon.

A conjugative vector-transposon was constructed by cloning the origin of transfer from
plasmid RP4 into the pP1 vector, producing the pPL1 vector (Figures 2a, 2b). The origin of
transfer was cloned in such an orientation that the selective aa(3)IV gene is the last
to be transferred during conjugation. The pPL1 vector was introduced into the E. coli S17.1
strain, which carries the RP4 plasmid integrated into its chromosome. In a conjugation
experiment between the donor strain S17.1 carrying the pPL1 plasmid and a DH10B E. coli
recipient, we obtained a DH10B strain carrying the pPL1 plasmid. These data show that the
cloned oriT fragment is functional in the pPL1 plasmid. Plasmid pPL1 can be used for DNA
cloning and gene inactivation by homologous recombination. Cloned genes or parts of genes
could then be transferred by conjugation into another host. Another advantage of this
vector is the conjugative transposon, which can be excised from the vector and inserted randomly
in vivo into another DNA molecule by purified Tn5 transposase.

Construction of a conjugative, site-specific integrative plasmid-transposon for horizontal 
gene transfer between E. coli and Streptomyces strains. 

The pPL1 plasmid was used to clone the integrase gene from phage φC31, resulting in plasmid 
pPAOI6 (Figures 3a and 3b). We tested several clones for horizontal transfer between the E. coli 
S17.1 strain and the Streptomyces lividans TK24 strain. The best transfer was obtained with the 
pPAOI6 plasmid in which the integrase gene is in the opposite orientation to the 
apramycin resistance gene. Conjugative transfer of pPAOI6 into the S. lividans strain 
was confirmed by resistance to apramycin or G418. Additional confirmation of transfer 
was obtained by PCR: we were able to amplify a 2 kb insert using primers specific 
for the φC31 integrase gene, whereas no PCR amplification was obtained for the control 
S. lividans TK24 strain. 

The pPAOI6 transposon was cloned into the EcoRV site of plasmid pGPS3 (New England 
Biolabs). This construction allows transposition not only by Tn5 transposase but also 
by Transposase ABC (New England Biolabs). The resulting plasmid, pTn5-7AOI, is 
shown in Figure 10. 

The goal of these constructions was to produce transposons for use in the 
functional analysis of the soil metagenomic DNA library constructed 
in the laboratory (Figures 4a, 4b and 4c). 

Functional analysis of the metagenomic DNA library from soil. 

The fosmid library consists of 120,512 clones containing ~40 kb inserts of soil DNA. 
The library thus contains approximately 4.8 Gbp of DNA cloned from soil. Ten percent 
of the library was screened by PCR for the presence of genes 
involved in the production of secondary metabolites (PKS). Using a gene-module-specific set 
of primers, we were able to identify positive clones organized in 96-well microtiter plates. 
Sequences (based on PCR products) obtained from fifteen randomly chosen positive 
clones indicate that the DNA library contains very little sequence redundancy (limited to 
one) and that the sequences are new and very diverse in comparison with gene 
databases (data not shown). 
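
The library size quoted above follows directly from the clone count and the mean insert length. The following sketch reproduces that arithmetic; the function name is hypothetical, and the 40 kb figure is treated as an exact mean purely for illustration.

```python
# Illustrative check of the library size; names are hypothetical and the
# ~40 kb mean insert is treated as exact for the purpose of the estimate.

def library_size_gbp(n_clones: int, mean_insert_kb: float) -> float:
    """Total cloned DNA in gigabase pairs."""
    return n_clones * mean_insert_kb * 1_000 / 1e9


N_CLONES = 120_512
MEAN_INSERT_KB = 40
SCREENED_FRACTION = 0.10   # ten percent of the library was screened by PCR

total_gbp = library_size_gbp(N_CLONES, MEAN_INSERT_KB)
screened_clones = round(N_CLONES * SCREENED_FRACTION)

print(f"Cloned soil DNA: {total_gbp:.2f} Gbp")
print(f"Clones screened by PCR: about {screened_clones}")
```

With these inputs the estimate is about 4.82 Gbp, consistent with the approximately 4.8 Gbp stated, and the screened tenth corresponds to roughly 12,000 clones.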

Fosmid DNA was prepared from two positive clones (FS3-124 and FS3-135) and 
analyzed by sequencing. In silico DNA analysis shows high G+C contents of 72% and 
69%, respectively, for the cloned inserts and the presence of gene clusters that could be 
involved in the biosynthesis of secondary metabolites (Figures 5a and 5b). No specific 
phenotype was observed for the two clones in E. coli. We employed pPAOI6 transposon 
mutagenesis to produce conjugative mutants. Transposon mutants FS3-124::pPAOI6 and 
FS3-135::pPAOI6 were isolated using apramycin as the selective antibiotic. The mutants obtained 
were then tested on LB plates containing chloramphenicol. About 1% of the 
tested clones were chloramphenicol sensitive. These clones contain the transposon inserted 
into the locus encoding the chloramphenicol resistance gene and not into the cloned DNA insert. 
The ApraR and ChloS transposon mutants were then used for horizontal gene transfer into the 
Streptomyces lividans TK24 strain. Fosmid DNA was prepared from the mutants and 
transformed into the E. coli S17.1 strain. Horizontal gene transfer between E. coli S17.1 and 
Streptomyces lividans was made possible by the inserted pPAOI6 transposon. Transconjugants of 
Streptomyces lividans were tested by PCR to confirm gene transfer and integration of 
the conjugative fosmid into the S. lividans chromosome. Both transconjugants showed an 
increase in doubling time, morphological modifications and pigment production in 
comparison with the control (Figure 6). 
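
The two-step phenotypic screen described above (gain of apramycin resistance, loss of chloramphenicol resistance) can be expressed as a simple filter. This is only a sketch of the selection logic: the record type, the function name and the example clone names are hypothetical and do not come from the experiments.

```python
# Sketch of the ApraR / ChloS selection logic; the data structure and the
# example records are hypothetical illustrations.
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutant:
    name: str
    grows_on_apramycin: bool        # transposon marker acquired
    grows_on_chloramphenicol: bool  # vector marker still intact


def is_transfer_candidate(m: Mutant) -> bool:
    """Keep mutants that carry the transposon in the vector's chloramphenicol
    resistance locus, which leaves the cloned insert uninterrupted."""
    return m.grows_on_apramycin and not m.grows_on_chloramphenicol


mutants = [
    Mutant("example-1", grows_on_apramycin=True, grows_on_chloramphenicol=True),
    Mutant("example-2", grows_on_apramycin=True, grows_on_chloramphenicol=False),
    Mutant("example-3", grows_on_apramycin=False, grows_on_chloramphenicol=True),
]

candidates = [m.name for m in mutants if is_transfer_candidate(m)]
print(candidates)
```

Only mutants that are both apramycin resistant and chloramphenicol sensitive pass the filter, mirroring the choice of ApraR and ChloS clones for transfer.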

B - From E. coli to Bacillus subtilis 

Plasmid pPSB was constructed as follows: a part of the amyE gene from B. subtilis was 
amplified by PCR using primers amyE-BamHI: atcgcaggatcctgaggactctcgaacccg (SEQ 
ID NO: 15) and amyE-EcoRI: cgactgaattcagatctagcgtgtaaattccgtctgc (SEQ ID NO: 16). 
The DNA fragment was digested with the EcoRI and BamHI restriction enzymes and ligated into 
the EcoRI and BglII sites of the pPAOI6 plasmid. The transposable element Tn<amyE-int 
φC31-oriT-apra> is shown in Figure 7 together with plasmid pPSB. 

Plasmid pPSBery (Figure 8) was constructed by cloning the ermAM gene into the pPSB 
plasmid. A 1140 bp Sau3A DNA fragment containing the ermAM gene with its own 
promoter was cloned from plasmid pMUTIN (Wagner et al., 1998) into the BamHI site 
of plasmid pPSB. The orientation of the ery gene was confirmed by sequencing. The transposable 
element is Tn<amyE-int φC31-ery-oriT-apra> (Figure 8). 

Plasmid pPSBery-AI was obtained after SmaI digestion and self-ligation of the core plasmid 
(Figure 9). In this construction, the φC31 integrase gene was deleted from the transposable 
element. The new transposon is Tn<amyE-ery-oriT-apra> (Figure 9). All transposons can 
be released as linear DNA by PvuII digestion of the plasmids mentioned above and 
transposed in vitro by Tn5 transposase (Epicentre). 
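
The cloning strategy for pPSB relies on restriction sites carried by the two amyE primers. As a minimal sketch, the following code locates the standard recognition sequences of BamHI (GGATCC), EcoRI (GAATTC) and BglII (AGATCT) in the primer sequences quoted above; the function and variable names are hypothetical.

```python
# Locate restriction enzyme recognition sequences in the two amyE primers.
# Function and variable names are hypothetical illustrations.

RECOGNITION_SITES = {
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
}

PRIMERS = {
    "amyE-BamHI": "atcgcaggatcctgaggactctcgaacccg",
    "amyE-EcoRI": "cgactgaattcagatctagcgtgtaaattccgtctgc",
}


def find_sites(sequence: str) -> dict[str, list[int]]:
    """Return 0-based start positions of each recognition site found."""
    seq = sequence.upper()
    hits: dict[str, list[int]] = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)  # allow overlapping matches
        if positions:
            hits[enzyme] = positions
    return hits


for name, primer in PRIMERS.items():
    print(name, find_sites(primer))
```

The amyE-BamHI primer carries a BamHI site, while the amyE-EcoRI primer carries an EcoRI site immediately followed by a BglII site. BamHI and BglII generate compatible cohesive ends, which is consistent with ligating an EcoRI/BamHI-digested fragment into EcoRI and BglII sites.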

Transposed elements were selected using 100 µg/ml erythromycin or 
40 µg/ml apramycin in E. coli, or 0.3 µg/ml erythromycin in B. subtilis. DNA was 
introduced by electro-transformation into electrocompetent E. coli strains or by 
competence into B. subtilis. Integration of imported DNA into the amyE locus of the B. subtilis 
chromosome was tested using the pPSBery-AI plasmid and confirmed by 
PCR using plasmid-specific and amyE locus-specific primers. Fifteen eryR B. subtilis 
clones were tested by PCR. All transformants showed integration at the amyE locus of the B. 
subtilis chromosome, confirming the functionality of the method and constructs. 

CLAIMS 

1. A method of analysing a library of polynucleotides, said polynucleotides being 
contained in cloning vectors having a particular host range, the method comprising (i) 
selecting cloning vectors in the library which contain a polynucleotide having a 
particular characteristic, (ii) modifying said selected cloning vectors to allow a transfer 
and integration of said vectors and/or of the polynucleotide which they contain into a 
selected host cell, and (iii) analysing the polynucleotides contained in said modified 
vectors upon transfer of said modified vectors into said selected host cell. 

2. The method of claim 1, wherein the library comprises a plurality of unknown 
polynucleotides. 

3. The method of claim 1 or 2, wherein the library comprises a plurality of 
environmental DNA fragments. 

4. The method of any one of the preceding claims, wherein the cloning vectors of the 
library are E. coli cloning vectors, preferably cosmids, fosmids, P1 or BAC. 

5. The method of any one of the preceding claims, wherein the selected vectors are 
modified by targeted insertion, into the vector, of a target polynucleotide construct. 

6. The method of claim 5, wherein the targeted insertion is performed in a region of the 
vector distinct from the polynucleotide. 

7. The method of claim 5 or 6, wherein the target polynucleotide construct comprises an 
origin of transfer functional in the selected host cell. 

8. The method of claim 7, wherein the origin of transfer is functional in E. coli host cells. 

9. The method of claim 8, wherein the origin of transfer is selected from RP4, pTiC58, 
F, RSF1010, ColE1 and R6K(a). 

10. The method of any one of claims 5 to 9, wherein the target polynucleotide construct 
comprises an integrase functional in the selected host cell. 

11. The method of claim 10, wherein the integrase is selected from φC31. 

12. The method of any one of claims 6 to 11, wherein the target polynucleotide construct 
comprises a transcriptional promoter functional in the selected host cell. 

13. The method of any one of claims 5 to 12, wherein the target polynucleotide construct 
comprises a transposable nucleic acid construct. 

14. The method of claim 13, wherein the transposable nucleic acid comprises, flanked by 
two inverted repeats, the target polynucleotide construct and a marker gene. 

15. The method of any one of the preceding claims, wherein the cloning vector 
comprises a first marker gene and wherein, in step (ii), the cloning vector is modified by: 

- contacting in vitro, in the presence of a transposase, the selected cloning vectors with a 
transposon comprising, flanked by two inverted repeats, the target polynucleotide 
construct and a second marker gene distinct from the first marker gene, and 
- selecting the cloning vectors which have acquired the second marker gene and which 
have lost the first marker gene. 

16. The method of any one of the preceding claims, wherein, in step (i), the cloning 
vectors which contain a polynucleotide having a particular characteristic are selected by 
molecular screening. 

17. The method of any one of the preceding claims, wherein, in step (iii), the modified 
cloning vectors are transferred into the selected host cell by conjugative transfer. 

18. The method of any one of the preceding claims, wherein, in step (iii), 
polynucleotides are analysed by determining the phenotype or properties of the host cell 
upon transfer or expression of the modified vector. 

19. A method for the identification or cloning of polynucleotides encoding a selected 
phenotype, the method comprising (i) cloning environmental DNA fragments into E. coli 
cloning vectors to produce a metagenomic library, (ii) identifying or selecting cloning 
vectors in said library which contain DNA fragments having a particular characteristic of 
interest, (iii) modifying the identified or selected cloning vectors into shuttle or 
expression vectors for transfer and integration in a selected host cell, (iv) transferring the 
modified cloning vectors into said selected host cell and (v) identifying or cloning the 
DNA fragments contained in said modified cloning vectors which encode said selected 
phenotype in said selected host cell. 

20. A transposable nucleic acid construct, wherein said construct comprises an origin of 
transfer and elements for integration and selection in a selected host cell genome flanked 
by two inverted repeats. 

21. A library of polynucleotides, wherein said library comprises a plurality of 
environmental DNA fragments cloned into cloning vectors, wherein said environmental 
DNA fragments contain a common molecular characteristic and wherein said cloning 
vectors are E. coli cloning vectors comprising a target polynucleotide construct allowing 
transfer and integration of the environmental DNA into a selected host cell distinct from 
E. coli. 

22. A polynucleotide sequence comprising all or part of SEQ ID NOs: 1 or 2, or of their 
complementary strand. 

23. An oligonucleotide comprising SEQ ID NO: 3 or 

Figure 5a 

CDS 76..1134 
    /note="ABC_transportr" 
    /gene="TAP2 PROTEIN" 
    /blastp_match="Oryzias latipes" 
    /blast_score=0.002 
CDS complement(1096..2430) 
    /note="none" 
    /blastp_match="Anabaena sp" 
    /gene="ALR1117 PROTEIN" 
    /blast_score=2e-18 
CDS 1178..1624 
    /note="Gram_pos_anchor" 
    /gene="CELL WALL SURFACE ANCHOR" 
    /blast_score=1e-04 
    /blastp_match="Streptococcus pneumoniae" 
CDS complement(2506..3567) 
    /note="CONSERVED" 
    /gene="HYPOTHETICAL PROTEIN" 
    /blast_score=0.019 
    /blastp_match="Deinococcus radiodurans" 
CDS complement(2906..4222) 
    /note="glycosyl transferase" 
    /gene="lipopolysaccharide" 
    /blast_score=2e-23 
CDS complement(4092..5321) 
    /note="glycosyl transferase" 
    /gene="glycosyl transferase" 
    /blast_score=1e-15 
CDS complement(6337..8502) 
    /note="PUTATIVE" 
    /gene="GLUTAMINE AMIDOTRANSFERASE" 
    /blastp_match="Bordetella bronchiseptica" 
    /blast_score=1e-16 
CDS complement(8181..9530) 
    /note="none" 
    /gene="MEMBRANE PROTEIN" 
    /blast_score=0.035 
CDS complement(9531..10721) 
    /note="NOEC Transmembrane" 
    /gene="NODULATION PROTEIN" 
    /blastp_match="Azorhizobium caulinodans" 
    /blast_score=3e-07 
CDS complement(10504..11274) 
    /note="PUTATIVE" 
    /gene="HYDROLASE" 
    /blastp_match="Streptomyces coelicolor" 
    /blast_score=4e-14 
CDS 12874..13689 
    /note="HYPOTHETICAL Meth-transf" 
    /gene="PROTEIN PA1088" 
    /blast_score=2e-06 
    /blastp_match="Pseudomonas aeruginosa" 
CDS 14195..15976 
    /note="PUTATIVE, Glyco_transf" 
    /note="" 
    /gene="LIPOPOLYSACCHARIDE BIOSYNTHESIS" 
    /blast_score=2e-06 
    /blastp_match="Vibrio cholerae" 
CDS 15427..16512 
    /note="PATHWAY: INNER CORE LIPOPOLYSACCHARIDE BIOSYNTHESIS" 
    /gene="PHOSPHOHEPTOSE ISOMERASE" 
    /blast_score=3e-17 
    /blastp_match="Helicobacter pylori" 
CDS 15579..16253 

WO 2004/013327 PCT/EP2003/007765 

11/39 

    /note="none" 
    /gene="PHOSPHOHEPTOSE ISOMERASE" 
    /blast_score=2e-22 
    /blastp_match="Neisseria meningitidis" 
CDS complement(16505..17656) 
    /note="BIOSYNTHESIS PUTATIVE" 
    /gene="LIPOPOLYSACCHARIDE" 
    /blast_score=2e-17 
    /blastp_match="Thermotoga maritima" 
    /pfam_match="Glycos_transf_1" 
CDS complement(17657..18697) 
    /note="none" 
    /gene="ALR3073 PROTEIN" 
    /blast_score=6e-27 
    /blastp_match="Anabaena sp" 
CDS complement(18615..19304) 
    /note="none" 
    /gene="ALR4487 PROTEIN" 
    /blast_score=8e-07 
    /blastp_match="Anabaena sp" 
CDS complement(19301..20596) 
    /note="ATP_GTP_A" 
    /gene="ABC TRANSPORTER" 
    /blast_score=[illegible in source] 
    /blastp_match="Synechocystis sp" 
CDS complement(20535..21476) 
    /note="PERMEASE COMPONENT" 
    /gene="POLYSACCHARIDE ABC TRANSPORTER" 
    /blast_score=6e-41 
    /blastp_match="Clostridium acetobutylicum" 
CDS complement(22025..22951) 
    /note="involved in the synthesis of a polysaccharide capsule ?" 
    /gene="32.3 KDA PROTEIN" 
    /blast_score=3e-17 
    /blastp_match="Sphingomonas sp" 
CDS 23155..26523 
    /note="peptide synthetase" 
    /gene="mcyA, mcyB and mcyC" 
    /blastp_match="Microcystis aeruginosa" 
    /blast_score=0.0 
CDS 26409..34433 
    /note="polyketide synthase and peptide synthetase" 
    /gene="mcyD, mcyE, mcyF and mcyG" 
    /blastp_match="Microcystis aeruginosa" 
CDS 34418..37500 
    /note="CYSTATIN" 
    /gene="PEPTIDE SYNTHETASE" 
    /blast_score=0.0 
    /blastp_match="Anabaena sp" 
CDS 35359..37500 
    /note="gene cluster" 
    /gene="nostopeptolide biosynthetic" 
    /blast_score=2e-41 
    /blastp_match="Nostoc sp" 
Sequence 37500 BP; 6199 A; 12698 C; 12769 G; 5834 T; 0 other; 

gatcggtgcg ggcctcttcg ctattacgcc agctggcgaa agggggatgt gctgcaaggc 60 

gattaagttg ggtaacgcca gggttttccc agtcacgacg ttgtaaaacg acggccagtg 120 

aattgtaata cgactcacta tagggcgaat tcgagctcgg tacccgggga tcccacgtac 180 

cacgagctca tctggaagag cccgctccag acacccgagg aatggaagcg cggcgtcgtg 240 

agaccgtaca cgcagaagcg tctcgtcgcc ttcgttgact ggcttgcgac gacgatgaag 300 

ctggatctca ccaggatgtt cgccgcaggc agctcgatgg gcggatcggg cgcgatcatg 360 

ctcgcgattc gctatcccgc acgattcgcg tggaccgtgt cgtgggtcgg cgtccacgtg 420 

cccgccgact ctccgcaatt cacatcgtcg tacgagctgg tgtacggccg gcccgacttg 480 

aaggtgccgt tcgagaacgg cacgccggtc tgggatcatt ttagtgacgt ctggtacctg 540 

cggcagcatc cggagcagga catcgggttc atcacgttct cgaacggcaa gaacgactcg 600 

gcgatcggct ggcgccaggc cgtcgaattc ctgaagacgc tgcaggagac caggcaaccg 660 

cacctgttcg tctggggcca ggagggacac ggccagcgcg cgaagatgcc ggaaggtggc 720 
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ggcgaacggg 
tgcacgctcg 
atcaactcct 
gtcgtcatca 
cggcggctgc 
ggcaacaccg 
ccgcagatgc 
ggcgtcctcg 
gtccgcgcgc 
agccgcccgg 
aagtccccct 
cactttgttc 
gagatactcg 
tgcccgcacc 
gcgaaccacg 
gtagccgaac 
aaacatctcg 
ttcgagcagc 
gactttctcg 
cgccgccgtc 
gaaggggatc 
gagcaccgcg 
cgaatgcagc 
gagattccgg 
cgctcgatgc 
tcgtgcgagc 
gggcagccgg 
acgcgggccg 
cccgccgcaa 
caggtgccgg 
gtgaggtcgc 
aggacatcgc 
tcgtcgagct 
tcgcagtgtg 
gggatgagcg 
cgccgcacgg 
agtcccttgc 
gggctccgta 
cgaacgataa 
cgggattggc 
tgaccagggc 
cgatgggtgt 
acgactgcac 
ggaaacgcca 
ggctgccgtc 
aggcttcgat 
cgaagtcgct 
gtcggaaagc 
ctttggccag 
cgagaatccg 
atccatgggc 
gatcggtctt 
ggagcttcga 
tgccggccgc 
agcagacggt 
tctccggtcc 
cgattcccgg 
gcccgggcga 
actgcgccgc 
agcaggagcg 
ggaatcgaga 
atggcgagcc 
aaaaacacgg 
gacaggatgc 
tcgaggctgg 
ccgctcggga 
gcgcgctgca 
cccggggtct 
gcgtcaacga 



agatgccgct 
acgacgatcc 
acgtcacctg 
agctggcgaa 
agcagttcaa 
tcgtgcagcg 
aggtctcaac 



acgagccggc 
gcggcgatgt 
aacgcctcga 
gcgacggccg 
tgcacgcctc 
cgtacctccg 
gcggcggtcg 
gacggccaga 
acgccgcaga 
acgtcgatgc 
aggcgctcgc 
aagcggagca 
gacgccagcg 
ccgcccagga 
tcgggccgcg 
aggacgtgcg 
ccgcggcgga 
gaccactcct 
gtgtgaatga 
tgtgtgagca 
gtacgcccgc 
tcacggaatc 
cgccgatccg 
tcgtgccgtg 
gctccaggtg 
tcgcgccggt 
caagcacgcg 
agagccactg 
agtaggagag 
cggcgtcctt 
tcgtgtgacg 
ctcttcttcg 
catcgcctgc 
gtgcgtgccg 
ctcgaacgcc 
gagcaggtcc 
ggttccgggc 
gccggcgatg 
gagcaagtcg 
ggcctcgaat 
atggtggtcg 
gagctcccgc 
gcgatcgagc 
cgtggccagc 
gtactcgtgc 
ccagattccg 
tttcgcccgt 
gatcgcgtac 
gccgcccgtg 
cctaggcgcg 
tatcgtacgt 



gccggcgccg 
cggtaccgac 
cgagcgccag 
ggtgacggaa 
cgcgaagcga 
ggccgccgcc 
tatcgaagcc 
gccagtccac 
cccagttgcc 
ggagcagatc 
acgccagcac 



cgacctcaga 
gggagacggg 
gcagcctgac 
tcgaccgccg 
gctcaagccc 
cggcgaagcg 
gggtggaaac 
gcatcggcgc 
cccgtcgcgc 
ctgattccgc 
gggtcacgac 
tcgcgatcgc 
ggaccgcgcc 
ggtccatgcc 
cctctcgcgc 
agacgacccg 
cgttcggcac 
gctcgctcac 
agcggccggc 
cgcgccattt 
tcggctcgca 
tcgcggcaat 
tgaggggtcg 
cgcgcacgac 
cgtcgtcgtg 
gatgaaacgc 
gaaggactct 
ggcgatcacg 
cggaccgacg 
caccatccgc 
cgtgtccggt 
gacccagccg 
ctgcgaaaag 
gccgaggaaa 
cgtcgaccct 
caaggcgggc 
gaggctcacg 
agcgagtcgt 
atccggcgtc 
tcgatcgcgt 
ttcgacacga 
atggcttcga 
agcccgtgat 
ccgagctcgc 
acgagacgga 
aatcgtttct 
cccagtcctt 
atgccattcg 
cgaatctgtt 
gggtagtaga 
ggaatcacgc 
gcgtgcacga 
gtgtcgaacg 
tcgtccatgc 
ttggagcgat 
ccccgcacgc 
gcgtctggca 
gccggccacg 
gatcatcgcg 
aatccaccac 



cagatcgagg 
gctcacgcgc 
tcggggatcg 
ccggcgcagc 
accggcgcgg 
gtcttcggtg 
gtcaccggga 
gacggcgacc 
gtcgccctcc 



acgaaccaga 
tcggattcga 
tcgatcgtcg 
cggtcgcccg 
ggcgatcagg 
gtcgcggacc 
aggatcctgg 
cagttgatgt 
ctgcggcggc 
cacggagcac 
gttcgggagc 
aatcggcgcg 
ggtcacttca 
cacgagcatc 
gaaccagagc 
cgcttcggcg 
gaccgacgcc 
gaccgtggtc 
ctcgcggcgg 
ctcggagtcg 
ggcgtagcgg 
ccgtcgttcg 
atcgcccggc 
gtcggtcgac 
cgcgagcgag 
gcgaatgcga 
caccgtgcgg 
cgggcgggaa 
tcagccatga 
ggtgtgcctg 
ccgctgggga 
agatggcatc 
atggtgccga 
gccgcgcgaa 
tcgagggccc 
aggacgagca 
cccactgcag 
agagcgcttc 
gcgccgcaat 
ccgcgaggcc 
gctcggccgt 
gcacggcgtt 
ggacggccga 
gggcggcctg 
ggcgcggcca 
gcggctcgag 
cacgcgccgc 
ggatcgtgac 
cggacacggc 
cggtgcgttc 
gctcggcgcg 
tgtcgatgcg 
aatgcttctc 
cgaacaccgg 
cggcctgcgc 
tgcggagctc 
gcatggcccc 
acgagcgact 
gctgcgcgca 
ccgccgcgct 
accgggatga 
aggttgtcgc 
ccctggtgca 
cgctggcaga 
tcgaaggcct 
ctctgcccac 
gcttcgtact 
accggccagc 
gccgaccggg 



gcctgccggc 
gcggcgtccc 
acgaacccgg 
ccgcagtcga 
tgacatggac 
ggtggggtct 
tgagcaggaa 



gcccacgaaa 
tttgccagca 
gccggccgtg 
cccgcggcaa 
accgcgatcg 
atcgtgggat 
agccgcgcgt 
gccgcgcgct 
gccggtggtc 
gtgaggccgc 
gccgccgcgc 
aacagccagc 
acgtccacca 
gccatcccgc 
agcacctggt 
agcgccagac 
gccgtcacgc 
acgaggtgca 
tcgccccggt 
catccgatcg 
tcggcgcggt 
ccacggcggc 
gctgctcacg 
tgtggacgcc 
cgggtcccac 
aggcgaccgt 
ggtattgtcc 
ggtcggcgcc 
ccagcgcgat 
cgcgcgcccc 
cactttgcgc 
cgcacgggtg 
tgacgcccgg 
gcccccggcc 
gggcgtccct 
gacgtcggac 
ttcctcgagc 
cttcgcctga 
acggccaacc 
ggcctcagat 
gacccgatcc 
aatgacggca 
gcgaaacgaa 
tcccaggagc 
cagatctttc 
ccggacctcg 
atcgcgctgg 
cgttccgagc 
gagcacgcgc 
cccggccgac 
gtgcgatctc 
agccggcgaa 
gccacagcgc 
tcacgctggg 
ggccgcgcca 
cgctgcggag 
tatccacgtc 
cgcgcttcac 
gacggcgaaa 
gcgcgccggc 
gcgggtcgat 
ccgcgcggtt 



gttcagccgt 
ggccgggcag 
ccgatggagc 
cgtcacgccg 
gaactccgcg 
cgtcacgctc 
ctaggcgggc 
gctggctgag 
aatccagcac 
cgatggtggg 
tcgcttcgag 
ccgatcgcca 
cggccgcgag 
ccgggcgggc 
cgttgggctc 
ctttcggcgc 
ccgtgacttc 
gccgcatcgc 
ctttcggcca 
tgtccaggac 
tgcagtacgc 
agatctccgg 
cgctcgcgat 
cggcgagatc 
cgtcggcgga 
tcggggcgta 
ctctcggatg 
cacgaccgac 
gctgccgatc 
gatcggccgc 
ggcgccgacc 
gtacacatgc 
gaccgacggg 
gatgacgccg 



gaggaagggc 

gccacgggcc 

gcgtcggacc 

atccgccgct 

cgtccggccg 

ccgggctgaa 

gcggtcgcga 

tcgtagtccg 

gtgtggccga 

ggacgccgga 

agggcggcac 

gcgccgatga 

cgtggatcgc 

ggccgtccgc 

gggaagcggg 

tggccggtcc 

agggcgagga 

acgagcgcgc 

acgtagtcga 

tcgcggatgt 

agaatcgtct 

acgggacgcg 

gagcgcgatg 

gacgagtccc 

accggcaagc 

gaggcccagc 

cagatcccgc 

cagctcgccg 

ccgcgggtcc 

ttcacaggtc 

ggcaagattg 

cgagtcgtac 

agcgacgacc 

ctcatgatcg 

gcgaagcgtc 
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1380 
1440 
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1560 
1620 
1680 
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1980 
2040 
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2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
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2760 
2820 
2880 
2940 
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3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
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3840 
3900 
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4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
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gccacgttca 
gagtcgtcgg 
gtgcgatgaa 
acgaacgaca 
gctttcgtag 
cccctgctct 
cgatcgcctc 
actctgcggc 
ggatctgccg 
cgatcgtcca 
cggtggacgc 
tgttccgcac 
tgatccccgc 
cgatgagcgt 
ccgattgcgg 
ccaggatgtc 
agcacgccac 
attgctgtcc 
cgccgcgcac 
cgtcagtggc 
gaatgacgac 
tgctggccat 
tgacgcgttc 
tcgccggcgc 
gaagcgccac 
gtgttcgcgc 
tctcccgtga 
cgtgcgcgcc 
ccagcgatcg 
gttgaccttg 
cgagttccgc 
cgtgtcgtcg 
gaacggcagc 
ccgatgatgc 
gtcgaggagc 
tgtgaagagc 
gtcgatgaac 
cgccaggctc 
gtcccgcagg 
ggtgagcgag 
ctcgaagaag 
ggcgcagccg 
cagcacggcg 
gagcgacttc 
cgtcgtacgc 
gcctgacggc 
cagggtcttg 
gagagccggt 
tccggccgcg 
cactcgctct 
catggcaggc 
cgaaccaaac 
cggccggctg 
gaccgcccat 
cgccgctgac 
gacgccgagg 
ggtgtcgtct 
gcaatcgaca 
ccgcccacgc 
gccagggccg 
aggagagtga 
cacgcgctgt 
taggcggccg 
agcgcgaccc 
tcgctcgcgt 
tccgtcatca 
ttgcgcttcg 
cgggagtacg 
ttggcccatt 



cgtcgggcag 
tcgacccgtt 
tcgacgccag 
cgcgcggcgc 
cgcggcccgg 
cggccgccga 
cctggcggct 
ggcgttcgtg 
gctcgtcagg 
cggcgtcgcc 
gttggcttcg 
gatgccgcgc 
ggccaggccg 
gtcctgcgag 
gccgcgaatc 
cacgccgtcg 
gaggccatcg 
cgcatcgatg 
ctgaatcgcg 
gccaaccgag 
gcggtcccga 
ccggtcgagc 
gagcgcctgc 
cgcccatccg 
gagtgcgccg 
aaccagagct 
cgcgtgtcgt 
tccggggcga 
aagtcgtgat 
tcgaacacct 
accttgagca 
cgccatgccg 
cgcacgtcca 
tcacgaagat 
gcgtgcagcg 
gtgtccagcg 
tcgccgcggc 
gtcttggcga 
aaatcgatcg 
accatgcgcg 
cggtggtcgg 
gcgacgccca 
cggctgtcga 
tcgaccgcgc 
ggtccgcggt 
cagtcgtacg 
tcgcccagga 
ttgcgcgccg 
tcgacccgcc 
cgcgccggat 
cccagccgtt 
tcgctccgga 
acgtcgcgcc 
cggtccgacg 
gcggcgtcgg 
aggccgttca 
gagcggccgg 
agccgaggaa 
agaatgcaat 
gatcccgctt 
gcaggaacgc 
gaaccgagcg 
tgaaggcatt 
tccagaagtg 
cgatgctcga 
cgagcgacat 
agtacaggaa 
tgctgatgcc 
tcctcgcttc 



cgacagcact 
gtccgccacg 
gcaacggctg 
cacggcaatt 
gttgggggcc 
tcgaaatcga 
gccgacaggt 
acgttgttcg 
ttgttgcgga 
gtgccgagcg 
atcgcttcgt 
tggtgatcga 
atggcgatgc 
ttcgcccagg 
cgcaggaaga 
gtgtaatcgg 
tccgcgaagc 
agctggacgt 
tggtacccga 
agcgccacgc 
tcgccggagc 
cggtactcgc 
agcagctcgg 
cagaagtgcg 
gccacgatga 
cgaggatgag 
cgatgatggc 
acagctcgcg 
agtgacggta 
tctccgccag 
gcgccgggtt 
gatcgccgct 
cgacggtgcg 
agagatagct 
agtcggccgc 
gcaggccgct 
ccgtcatcgc 
gctctccgcc 
ccagcatctc 
cctggttcgg 
taccggtgag 
gcgtgtacgt 
ggccgccgga 
gcctgaacgt 
agagatcggc 
cgagcacggc 
cgaagcccag 
aggtcctgag 
agtagatcgg 
cgatcacggc 
cgtacaccaa 
gcgccgcttc 
cggccgccac 
gatctcgata 
cgcggctgaa 
tcagacgacc 
aaccggccgc 
gtgccagagg 
gagggcggtc 
cgagacccgg 
cgcgaggccc 
ccccttgccc 
gtgaccgacg 
cagccggctc 
gtcggacgag 
cgcgacgaag 
atagacgagg 
gcgatagagc 
cgtctgcatc 



cgcgcggatc 
acgatctcgt 
aggtgcgccg 
ttcgcggtca 
gcgagaactc 
gcggcgcctg 
gcagatcgca 
tcgcctccgc 
tgagcccgtt 
tgaagacgct 
cggcccattc 
tctgcttctc 
ccctgaacga 
cgaggatcgc 
cgttgtcgcg 
acggcgcgct 
gtccgtcagt 
tgtgcaccgt 
cgtggccgac 
cgatcgcgtc 
ggctccggag 
cgttctcgag 
acgtccgccg 
gcaccggcgc 
ggcgagtcat 
caggatttga 
gcgcaacgtc 
ctcgacggct 
cccaggcgcg 
cggaccggcg 
gccggcctgc 
cagcaggacc 
gaagagatcg 
gctgagctcc 
gccccgtccg 
gctgatgtag 
gaacacgtgg 
gtggccgcgc 
ggtcagtccg 
cagaaagtcc 
cttcgacagc 
gatgagcccg 
gagcgacagc 
cgtgtgcacc 
cggctcgaaa 
gccggggtcg 
cgtgacgtaa 
cacggaagcc 
atacgagccg 



gagcgaaaac 
agccaggaga 
gttgtagaga 
ctcgtcgagg 
gacgaccgtg 
ggcggtagat 
ccgcgggcag 
gcggcgtgcg 
atctcggtgt 
tggacggcca 
cggaccctgc 
acgaacccaa 
caggcgccgt 
ccgtaaatcg 
tgggccgatc 
ctgatcgagc 
accgccgccc 
ccaagcacgg 
accccgatga 
agcgtgacga 



cgcgctctgc 
accagccagc 
cgccgttctt 
tcgatttcgt 
gaacgcgccc 
cggaatctcg 
cgaggccgcg 
ctcaccgccg 
cgtcacggga 
gttgtgctcg 
gttcaggttg 
gccgtcgcgt 
atcgacgatg 
cgggccggat 
gatgacccac 
cgtcgtgtac 
gccgacgctg 
ggctcgggac 
cgtcagatcc 
gccggtcatg 
cacggtgccg 
cagaatcgtg 
cacggtgacg 
ggcggcctgg 
gccgccaccg 
aggaggtagc 
tcggggcgca 
tcgagcagtc 
ttcaggcgct 
tcgccgcgcg 
gtgatcgcgc 
tgcaggaact 
agcgagggaa 
gcgggtgaga 
gagaggcgcg 
ttggcgcgcg 
ccgtcggtgt 
agcaggacct 
tggctcagat 
ttgaggtacc 
tgttgcgcga 
ccaacaccgt 
ccgaagcgat 
gcgtggacgt 
taccgggaca 
aggagcgtga 
tcggcggccg 
agatccgatc 
aacggatccg 
gcgccgtcga 
tcggcgcgcg 
tcgccgcgaa 
aagccgagcg 
cggcgcggcc 
gccggctcat 
acacgcgcga 
ccgccaccat 
actgaaagat 
gcgcgtacgg 
ggcaggcgat 
gctcagccac 
ccgacgtgtc 
gccggtcatc 
gctgatcgcc 
tcatgcgatc 
cggccacgag 
cgagcgtgag 
gcagcacgcc 
tgagcggcac 



cacggcacgc 
cggcgcctcg 
gacggggatg 
agcgcggccg 
agatccggcg 
atgccggcgt 
tttgcgaacc 
ttacggcgca 
aaccggacgc 
atccggaagt 
cagatcacgt 
gcgtaggccg 
acgttccgtt 
cgccagccgc 
ccccgggccg 
tcgaagcgcg 
cctttgagca 
gcgccaagct 
gcaatcgtca 
ccgccgccgc 
gggacgttga 
gtgcgcggct 
acctgtttcg 
ccggacggca 
tgtcgacgtc 
cgcgcgactg 
ggaacatccg 
gcgtctgcag 
tcagcagggt 
cgccggtgtt 
gatgaagcgc 
cgcgatccat 
tcgtgaagcg 
gctccacggt 
cagcgtctgc 
acgtcatgta 
ggagcggcca 
cgataccgga 
acatgccatc 
ggtcgtcgag 
tcacctgatc 
tcaccgcact 
gcctgcccga 
acgcgtcctt 
cgctcgcggc 
cgcccgccgc 
cccggaggct 
cgaacgacag 
ttgcaagcag 
gccgggcggc 
tcccgttcgc 
agatcaccag 
tgccgagccc 
tgcccgactt 
cgcggcaaac 
cggaacaaga 
ccggttgagc 
gacgaacgtt 
cgccagctcg 
gagcgccagg 
gagaccgaac 
gtagtcgttg 
ggccatgcgg 
aaccgccgtg 
ccagaactct 
gacgagcatc 
gaagccgccc 
cagcgcccac 
gagcatgaac 
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atgccgacgg cgacgccgtt gttgtcaccg agcgagtaga cgtcgttgaa gttcttgatg 9060 

ccgggaatga acaggagctg accccagccc tgcttggcgg cctcgaaccc gagggagagc 9120 

gagatcacga ggaagacgag ccgcaaccgc ttgacgtccg tcgtgaggac tgcgagcagg 9180 

tacgtcatca cggtggactt catgaagtcg atcatgtagc cccacgcgaa gtcctggtag 9240 

ggcgacgcca gcgtcgacga caggctgtgg agcccgaaga aggcgaggag cgccagccgc 9300 

gcgtcgatcc gcagcgcctg cccggacagg aacgcgatcg cgagcgtgta catgcccgcc 9360 

aggagcgaga tgttcatgtg gatgagaaag tcgctccaca cccagagctc cggccggaag 9420 

tacgcgatga acaggtagaa cagaatcgcg tagaacggcc cgcgcagcgc gtggaaggcg 9480 

ccgaaggcga ggagggcgag gacgaaagcg gctcgaagca ttgccgtgtg ttacctgtta 9540 

tcggtcccga tggtccagag cggctgcgtc cgggtctcct ctacattgaa caccttgtag 9600 

agggccggaa tcgaggtgaa catgagcacc acgaaaagaa tcgccgacac caccatgtag 9660 

gcgaagaacc cccgctgctt gtagagcttc tccggattct gcacggggct gttcggcagc 9720 

atcccgatgt gcaggtagta ggcgaacatg cccgccgcca ccggcgcgaa caggatcagc 9780 

tcgaggtgat agcggacgat gaagatgccg ccgaagaggg cgcagcaggt ggcgtagaac 9840 

accatgctca cgagcagccg ctcctcgttg tagaaggcaa acgacttccg gtacgaggcc 9900 

gcgaccgacg cgttgccgat gtgccggaac tccgcgtacc gcttcgtcgc catgaagaac 9960 

gcgcccacca tccagtacga gatcacgagc gacatcggcg gcacgcgatc ctgaatcagc 10020 

gggaaccagc cgaggaggag ccggacggcg ttgttcgccg actcgctcag cacgtcgagg 10080 

tacggccact ccttcgtgcg aatgggcggc acgttgtacg tgacgccgag cacccagagc 10140 

gccagggccg acagcgcgaa gtatctgttc acgagcagcg cgagcacgaa tccggccacg 10200 

cccacgagga tccactcggt gtagcccgcc gccggcttga tcttcccgga gggcaccggc 10260 

cggtgccgtt tctccggatg cagcagatct ctcgggccgt ccaggagctc gttcagcacg 10320 

tagttgctcg aggcgatgag gcaggtcgcg gcgagcgcga gcgcgagcgg cgggacggcg 10380 

gtccagccga agagctgcgg ttcgtagaag aacgccagca gcacgccgag cagcatgaac 10440 

gcgttcttga accagtgatc gatccgcgcg atctggacgt acggccaaat ccgcgaaccg 10500 

gagctagcgg ctgacgacat cgttgagctc ctcgagccgc cggatcgcga accatccgcg 10560 

gccggctgtg acgccgtggc ggccggcgat gagggcgcag gagagtcctg cggccatcgc 10620 

tccttctccg tccacgtccg gtcgatcccc gacatacagg acctcccgag gtccgagtga 10680 

ccacatctca cacgccacat ggaagccgcg cgggtgcggc ttcagggcgt tcacgttgct 10740 

ggcggtcgtg cagagcgcca gcgaaaaatg ctgcgacgtg ccgagcgcgg cgagcttgct 10800 

ctcgggcgcg tagtccgaca gcacgccgag cctgagcccc gccgcgcgaa gcgagtcgag 10860 

cgcgccgagc aaccccgggc gccggcacca ccgcagatac ttgagcggcc ggcgcaccat 10920 

ccattcgttc accgtcgccg ccacgacctc gttgtccagc ccgagcgcct gcgcggtgcg 10980 

cgcgatctgg cgcgccgcga gcggctcgtc ggccgcgccc aggcgccgga gatcctcatg 11040 

cgcccggcgg tactcccgca cgatgcgcgc cgtgtccact ccgcgccgcc accggaacgt 11100 

cacgagcggc gtggcggcga gctccgctgc catggcggcg cgcaacggac cctgccggta 11160 

cagcgtgccg tccacgtcga acagcacggc cttgaccccg cgcaacggct tcactctgcg 11220 

aggcccagca tccgcagaat cgccgacacc tttgacgcgt cgcgaaggaa cacgtcacga 11280 

acccacacgg cagccagccc cgccaacagc accgccgctg ccgcgcccgc tgccgtccgc 11340 

cagctcccgc gccaggcgcg cgtgtcgatc gcggcggcgc cgtagcagag cagaatcggg 11400 

acgagcggca ggtgataccg gggatgtccg aagacgatcg cgtgaatgcc gctcatgaac 11460 

acgatcaggc tcaggagcag caggtgcgcg cgccagttga ccggccgcgc gaacgagatg 11520 

ccgatgacgc cgagcaccat gaccgccgcg tacgatccca tgatggacac cgccccggcg 11580 

aagaacagcc accgcggcgg acggtagaac ccatttccga cgccggcgat gaagtcgcgc 11640 

tccagccccc agaagtctgc gaacttcagc accgaacggc gaagcgtcgt tcccggatgg 11700 

gccttcatgt agtcgagcgc ctggcgctgc gcccacttct ctttcgtgcc ctccgtccag 11760 

cggctgccgt ccggcgcgcg aggcggcatc gtgcgcgacc attccttgtc gcccgtgagc 11820 

gagatcgcgt cccacatgcg atcctcgggc gtgtacgcgt agttgcccat catcaggttc 11880 

agcccgccga gcgtgtccac gaccgtgaac gagcgttgca ggagcgtgtt gcgcacgctc 11940 

cacgggccga cgatcgcggc gtatccggcg aagagaagca cggccacgcg cagacgcgcg 12000 

cgaagcggga tgcccagacc gacgagcgcc gcggccatga ggatcgcgac gtacggccac 12060 

atgatcgagc gaacgagcgc ggacgcgccg agcgcgacgc ccgtcgcgag cgcgatccac 12120 

gccgagtgcc gccctgatgg gcgatcgatg agcgtcagac atcccaagag caacaaaaga 12180 

aggagtgtga tgaagagcgt ctcggacagc accagcacgc cggagaacag cagcgaggga 12240 

tagaacgcga acgccgccgc ggcaatgagg ccggcgcgtg cgctgaacaa cctgcgcccg 12300 

atgagataca cgagccagac accgagcagg ctcaacggaa cctgcgccag ccggaccgcc 12360 

gtcagcgatt ccgagcccgt gaccgcccag acgcccgcga cgaacgcggg aaagagcggc 12420 

gcccggatgg acgtcggttc ccccggcccc cacgcgaacc cgttcccggc gacgacgttt 12480 

ctggcaagcg tcgcgtagtg ctgctcgtcg cgaatctcga gcgcgacgtc cttgaacccc 12540 

caaatcaggc cgagccggag gaccagggcc agcgtgagga tgacgagaat tgcccggcgc 12600 

tcccaatggg tgcgctcggc ctcgtcgcgc ccgggggcga cgattggcca gggggccgac 12660 

gatgggagtg ggggagcctg catcattcgc taaaacaagc agttacggtc aaacctgtgc 12720 

caccggggaa tcagcaaaga ctctgccaac gggtccgccg gcggcgacgc ccggcgcgcg 12780 

ttttccggcc ggaagtcggc ccccaaggcg acccggctgg gcacagccca gtagaaatct 12840 

cattctggga tgacttttgt ccatgatcca tgaccatttc gaggggcctt cgtgacgccc 12900 

tggggatggc ataagcttgg cacctgcgat ggcgacgtgt cgatgatcag cggccgtgag 12960 

gagctccaga aggcctatcg ggacgaccgg gtcgcccgtg aatacgtggc gcggcggttc 13020 

cagtcaccgc tgggggcgct cctccattcg cgtcagatcg gggtcgtgcg agagctcgtc 13080 

cgcgcgcagg gcatccggcg cgcggcggag atcgcacccg gtccagcgcg gctgaccgtg 13140 
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cgaacaacag ttgcagtcga "cgcgx « 
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ccacgaactt ctcccgatcg accgcg g 
gtccctgctg cgccgcgtcg ^atcacgg 
gcggcgggct cttcaggacc caggcgatct 



ctcgtcgacg 
ctcgccgcgc 
ttcgatctcg 
ctctaccgcc 
aatgagattg 
gatgcgctgc 
gtgtcgctca 
gcgccgaggt 
gaaccgctcg 
atccgatcaa 
acgcgacggt 
tgctcgtgtt 
cgcgcggcga 
tggggcggcg 
gggcgatcgc 
gtatgtctcc 
cggtgccggt 
ctgagatcga 
tcgacgtgct 
acgccctccg 
cctgttacgc 
gtccgaattt 
acggcatcgc 
cgtcgctcgc 
cgcgccggct 
tgtacgtgga 
gggactcgcc 
ccgccatcga 
gctcggcggc 
gcgcgcgacg 
cctgacgtcg 
cacgctccag 
cgtgccgccg 
gctctcgagc 
gtccacgccg 
gctcgtcccg 
acgagcggcg 
cctcgggtcc 
ctacgtgaag 
gtcccaggtg 
ggacgtgatc 
gttttccctg 
gccggtcatc 
cgagcgcttc 
gttcttctgc 
gttcgtgttt 
gctcaagggg 
cgaagcgcaa 
aggcgctcga 
cgcagcattt 
ccgtggcgct 
agaccgtgtc 
acgacgtccg 
atgaaacgaa 
tgccaggtca 
atccacatct 
acgtgcgatc 
cggtgaggcg 
cctcggcgtc 
cgcccgggat 
gcggcagccc 
gcgcgtcctg 
ggcgcctcag 
actcgaccgg 
ctttgtgtgg 
cggccgtacg 
gaatcgcggt 
cgatgtagcg 
cctcgggatc 



cgagcgccca 
gggccaatct 
tctacacctt 
agattgcgac 
tgtctgcgcc 
tcacaccggc 
ccggtgcgca 
caacggtgct 
agtggatcgt 
agaagcgatt 
cgtgtcgttc 
gtccggccgc 
cgtgacgcac 
gccggttctg 
gcacgtggcc 
cggccgcgtg 
ctcgtcggag 
gccgcgcggc 
ggtgccgtgg 
tccgccaccg 
gcgggcgcac 
cgtgctcgag 
ggagatcctc 
cgcgagcctg 
cgcagaggag 
ggtggcatcc 
ggacggctgc 
tcgcagatga 
cgcgtgaacc 
ggcgtcctcg 
gacgcgaccg 
gatctgtcgc 
gcgagctttc 
cgcagccgtt 
acgaacatcc 
tacgacacgg 
gcacgatcga 
aactcggtcg 
cgcctcgtgc 
cgcaacccgg 
gatctcgtgg 
ctcatcagtt 
gggctgtacg 
gtggccatca 
tcgacgtgcc 
ggcgtggacc 
agctccgatg 
agcgctctcg 
ggccggacac 
cgccgccgag 
cacgaccgac 
agccggcagg 
gcaactcgcc 
tcgcgctcac 
tcgtcccgtc 
gggcgaagtt 
ttcgaacgtc 
gcggcggagg 
gtccgcctgg 
ggtggagccg 
ctcgtgcgtt 
gcggaagccg 
cgcctgctcg 
caggccgcgc 
cgcaagattc 
cgcggcttgc 
gaccggcacc 
cgcacacgtg 
cggttcgatg 
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13800 
13860 
13920 
13980 
14040 
14100 
14160 
14220 
14280 
14340 
14400 
14460 
14520 
14580 
14640 
14700 
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15120 
15180 
15240 
15300 
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15420 
15480 
15540 
15600 
15660 
15720 
15780 
15840 
15900 
15960 
16020 
16080 
16140 
16200 
16260 
16320 
16380 
16440 
16500 
16560 
16620 
16680 
16740 
16800 
16860 
16920 
16980 
17040 
17100 
17160 
17220 
17280 




gatgcttcac 13200 
cgtgcaggcc 
caggctcgtg 
gattctcgga 
gcttcgcgcc 
cgggctacgc 
tcgccggttc 
cgcgcgcgcg 
cgtatgccgc 
tcgaaggaag 
tcgccggccc 
gcctggatcg 
gtcttcggcg 
ctcaccgccg 
gtcgaagtcg 
cacctcattc 
agattcacgc 
attccgctcg 
cgccggtggg 
aacttcatcg 
gccaccgtcg 
ggattggcca 
gatcgatcgg 
gacacgctcc 
acgttcgacc 
gccgccgctg 
tcgggtgggc 
agcgcgacgc 
ggatcctcat 
gggccctgcg 
tggccgtcat 
gcctgtccga 
tcagaacgtc 
ccaggatcat 
tctactgcat 
cccgctaccc 
agttcgcgtt 
ggtgcggcgc 
gtcacttcga 
ccgaacgcgc 
acaagacgtc 
gcgactcgag 
tcaacgacgg 
acagcacgcc 
gcgacccggc 
gcattccgct 
caggacctca 
ctcgtcgatg 
aagatcctcg 
ctcgtgggcc 
acgtcgatcc 
tgagtgcctc 
caacatcctc 
cggacgcggc 
agatcagacg 
gatcgagaaa 
gaccaggagt 
tgctgatcct
acgagaaagc 
atcacgggca 
gacggcagca 
agaaagcgca 
tgttccttgc 
gacatcagca 
gcgagcatga 
cggtcgccgg 
ttgtcccggc 
acgatgtgct 
tgaaagtgca 
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cggccacgcg 
tgtgcacgtg 
cggcgagctg 
cgccctccgc 
ggacgccgcg 
ccagatgtga 
cggccgggcc 
gcagcaggct 
gccgcatcgg 
cgtactcgag 
gcgccgactg 
gcatcacgat 
ccgggatgaa 
cgccgtggat 
acgaagcggt 
gaagatcgag 
tgaagctcca 
cccacgagtc 
gattgcgggc 
atctcgccag 
cgggctgata 
tgaacaccgg 
gctctcgagc 
acgtttctcg 
gatgccgggc 
ggctgttcgt 
agcggcgccg 
tcggcgacct 
agatcgaatg 
gatcgctcgt 
gtgccgcagc 
acgagcacgg 
tacgtccatc 
caacgccccg 
cgacgcgaac 
cccacgatcg 
tctgggtcgg 
cgcggaaatg 
gccggtggat 
cgaacgtgac 
tccccggcgt 
acacgatctc 
acgcgtcgag 
catgctcgca 
cctggcgctt 
cgctcaacac 
cgagccgcgc 
cgccggcaaa 
tgatcgcacc 
cgacttcgat 
ggatcttcgt 
cgagcgcctc 
cgtccagcgc 
cccgcaggct 
cgggggccgt 
cagccgaaga 
ggctggttca 
tgcagggccc 
aacatccaga 
gccatcgaca 
ggcagccaga 
accacggagc 
tccctgggaa 
aagcggagcg 
tacgcgaccg 
ttgatgagcg 
agcaggtcgc 
atctcacggc 
ccgggcggat 



cgcgcgcgcg 
gaccaggggc 
ttgcaggccg 
gtcgagcgcg 
gcgtttggcg 
gacgaggatc 
gccagcctac 
gcgaagtccc 
caggcccgcc 
gtggccgagc 
atcgctcccc 
ggacgacttc 
accgaccggg 
gccggtcgcg 
cacggcgttt 
cgtgcgcagg 
gtcggcgtgc 
gtcggcgtcg 
gccggcaagg 
gatgtctgcc 
ggtctgggag 
gacgatcacc 
ggcttcagcg 
aaagccacgg 
cgaggtagct 
tgatgtggcc 
tgttcgtgag 
cgaggctcat 
agttcagcgc 
agccggccgc 
cgaagtcgtt 
gcgccgcctc 
gctcgaagcc 
accagtagag 
gcgcggcggc 
atgttccttg 
gttgtgaaac 
gaggtcgagc 
gttgccgtcg 
gtcgtccagc 
cacttcgctg 
gacggggctc 
gacttcgatg 
gagggtcgcg 
gaactcctgc 
ctcgtcgatg 
gttcatgccc 
ctcgacgatc 
ctggaggaac 
gagcgcgccg 
cagcaccttg 
gccgcggtgg 
gtcggcggga 
gtcatgacgc 
cacgtcagac 
cgagcgtcac 
ggagcagcac 
actggagctt 
cggcgatgac 
ggatcagggc 
ggatgttcgg 
cgatcgcgaa 
agtacacctt 
ccgaggagaa 
gcgtgtcgat 
gcatgaacac 
gccgcgtcat 
aatcggccag 
catgagggca 



gcctgcagcg 
cgcccgacga 
ccggcccgca 
gcgcccgcag 
gcggccgaca 
tcgtgcaggc 
gttactcgcg 
gacgggagga 
tgaccgaacc 
gcgagcagga 
cgcgccagga 
gacatgttgg 
cccttcgccg 
aacatctcct 



cgcaggacga 
acctcgccgg 
tgtcggaaga 
agcagcgcca 
ccccggttgg 
gtgccgtcgg 
agcgcggact 
gacacgaggc 
gccatggccc 
agacaacggc 
gaacccgcgt 
cattccgccc 
cacggcgacg 
ggagaggtcc 
gacgccgagc 
cttcacgcct 
catcgatcgg 
ggcggcgacg 
cagccgttcg 
ccgcttgagc 
gcatgcagcg 
accgccagcg 
acgtgcaccg 
gagaactcgc 
tagacgacga 
ggctcggtac 
tgcgtgcccg 
ccgacactcg 
gccgtgccgg 
accgcctgca 
atgcgatcga 
atcagcacgt 
gacgagtagc 
tcgtcgaacc 
acgttctcgc 
gcgcgccccc 
agggtggtcg 
acttcgaacg 
cgcggttgaa 
tcgccccgcc 
gttctccgcg 
gaccgagacg 
ggcgcggaag 
gccgcccacc 
gacctcgaag 
gatggccgtg 
tccgaccggc 
gtccacggcg 
cgagacgaga 
gaagttccag 
ggccgcgacg 
ggcccacccg 
ctgcatcagc 
gaaggcgcgc 
ggtagcgctc 



cgggtctcat 
gacccaatcc 
gcgcgttcag 
gaccctcgcc 
accggatcgc 
ggggggactg 

tcgcaggtcc 
agcacacggc 
gtttccgcgc 
gacgatggcg 
accgctccag 
tgtcgtgcag 
agaggcgcag 
gccggagcac 
ggtcggggag 
cctcgtcgat 
gatcgacctg 
cgaattcccc 
cctgagagat 
tcgatccgtc 
gcacggccct 
cggggatatc 
tctgcactgg 
gcgctcttca 
gccgcgaaga 
tggcccggct 
agcttctcgc 
gttgcgtccg 
ttttccgcgg 
cgtgcgagga 
atgccgggtc 
tgcccgaaca 
ccgatccggg 
tgggtcacga 
acgccgagaa 
tcgcaagctg 
acaggtagta 
ccatgaacgc 
gctgatcggt 
ttctcgcggt 
acgactcgtc 
gctgttccga 
cggccttcac 
ggttgtgcga 
agcacttctg 
cgggccgcag 
gcttgacggg 
gctcggcgat 
gcccggtgag 
ggacgatcga 
acttgcccgc 
agacgtttct 
acagcccgcg 
ggaacttctt 
aattcgaact 
atcgccgtga 
ccgtcgatga 
atttcgaccg 
aggtatttca 
gtgaatatca 
acgtggtagt 
ctcacgatca 
ttcgtgttcc 
acccacaggc 
cgcatgaaga 
aagcccatca 
agctcgcgat 
atcgtcagcg 
gagctgcccg 



gaatccgtac 
cggaagcatg 
gtcatacatc 
tgtaatccat 
cagcaccgcc 
attcatggtg 



gcccgcgcgg 
gtaccgcgtg 
cgcgtgaaac 
atgatcgctg 
cgtcctgacc 
cctgtaatgc 
ccagaggtcc 
ggcggctttg 
acagcgcccc 
gacggcgcgg 
ccggcgaatc 
tcgcgccgcg 
caccgtgatg 
gtcgacgacg 
ggcgatgtag 
ggtctgggcg 
cgtcgcgagc 
tccgctcccg 
tcttctgcca 
gcgcggcgct 
caagcggcgg 
cggccgtgga 
cgtaccgtcg 
acgcgtgcac 
gcaacgccag 
gatccatgac 
acatcttcag 
tcatgggttc 
ctccacatcg 
ccgcgacggc 
gtggccgcga 
ggacgcgtcc 
ggagcgttgc 
gacctggagg 
gccgacgaag 
tttcctggac 
cgagccggcc 
gacgaggacg 
ctggaacgtc 
gtgcgcggcg 
cgtgtccacg 
ctcgacccgt 
gtcgggatgg 
gcggccggtc 
cccgtttgcg 
gaccgcccag 
cacgagcgcc 
ggagacgttg 
cggagcgatg 
cgcccagcat 
tcggcgtcat 
ggtagaccgt 
cgtcgcggta 
cgagcacggc 
agatcatgag 
cggcggagaa 
cggtcaggga 
cgcagaacgc 
tcacggagaa 
ccgactgctt 
actcgaactg 
gtgtcagtcg 
aagagcaact 



acgacaggat 
cgcgcacacg 
cgccgccgga 
gccgtgcacg 
gcgccgccga 
gcgcgcgatg 
tagagcgagc 
agatcggtgc 
tcgccgcggt 
gcggcgcgat 
atgccttcga 
gtcaggggtt 
cagtcctcga 
accatgaccg 
tcgagcggcg 
cccgtatgaa 
ttggccggat 
gcgatggcgc 
gcgccgccga 
atcacttcga 
cgctcgccat 
gccggcgagg 
gaacagcatg 
cagcgtctgc 
gtactcgagc 
gaagagcacc 
cggaaggtgt 
caccgccgtg 
cgcgtactcg 
gaagtgaccg 
gaccccatca 
ggccgcgttg 
tccggcctgg 
gtcccgggcg 
gccacgccgg 
acgaggaacg 
acgagatgcg 
ccgagatccc 
accatgaagc 
ttcagctgct 
agcgagccgg 
gcgcggacgt 
agatacaccg 
atcgcgatgc 
atgtcgccga 
atcgagaagc 
aagtcggtga 
ttcatcccca 



aaccccgcgg 
gtcggcttca 
ccgatgatgc 
aactcctgtt 
ggcacgaggt 
tcgaagacga 
gaagaacagc 
cacggacggc 
cgggttgagc 
cgacgacgcg 
gaagaggttc 
gagcacggcc 
cgcgacgagc 
cgggaagatc 
ggtcacggag 
gaacaccgga 
gacagccgtg 
gtagcgcagc 
ctcggcgagc 
cggcgtcgtg 
gcacgaactc 
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ggtcgcgcgc 
gcggacgagc 
gctccagtac 
ccgctcgatc 
caccttgacc 
catcggcgag 
cacgtagggc 
cgccatgacg 
cgggacgttc 
aaacgccgcc 
ggactcacac 
cccagaccac 
tgtggaagaa 
ccacacgcgc 
gcatgccgac 
tgagcgaccg 
tcacgttgcc 
gagacgccag 
tcatcagcac 
ggaacacgat 
cgccgagcaa 
cgctcggcgt 
cgagcggtac 
accgcacgat 
ccagtctggc 
ggcagaactt 
gacgggattg 
atcaactcct 
aaatgtcatc 
ccgacagaat 
gaaaactgca 
ggcattgcag 
tggagcaggg 
cgggctcggc 
gcgccggccg 
aacgcgctcg 
tccctcgacc 
gcctacgcgc 
gccggggcgc 
gtcgagctca 
ccggacgccg 
gggcgcccga 
gtcgacgcat 
gacgtctcgt 
tccgaggaca 
gcccggctgt 
ccggatctgc 
gcgctcgtcc 
gccgaagcgc 
ctgccgccga 
cagcccgtcg 
ggctacttga 
gatgctcgcg 
ttcgtcgggc 
atcgagaccg 
gacgcgtcgg 
accggagccg 
cgcgcggcgg 
acggggatgc 
atccggtcgc 
ttccgtctcg 
gaccacgtcc 
gaggcggcgg 
gtcgcgcagc 
accgtgcgtc 
atgtttcacg 
cgcgagcgcg 
ttcgacgccg 
ggtgccttcg 



cgctcggcgg 
gcggcgtccg 
tcgctggcga 
cccttctgga 
agcgcgcggc 
tggatcgtgt 
tcgcgcggcg 
tccttggtga 
gcgagcggct 
gtgccgagga 
cgtggccggc 
gacgaacaac 
cccctgcgcc 
gccgttcgcg 
gcgcggatcc 
gatcccgctg 
ttcgcgcagc 
gagctgcagc 
caggaacgcg 
cgcgccggta 
ccccgcgacg 
gtccggcagc 
gatcagaatc 
cacctgccca 
ctgaagcaca 
gggcgccgaa 
gccggctgga 
ccttttgact 
tgtcaggaag 
cgcgtaccac 
ttacaggaac 
gcgctttcgt 
cagggacagg 
gggagctcgt 
taagttttga 
cgcaccgcct 
ggtccatcga 
cgctcgatcc 
ggatcgtgct 
tctgcctcga 
gcgtcgatct 
agggcgtcgc 
cggccgtcgg 
tccaggaaat 
cgcgccggga 
atctcccgtt 
cggcgttgcg 
agtggttcgc 
atgtcgtcac 
ttggcgtgcc 
cgcacggcaa 
atcagccgca 
tgtatcgcac 
ggacggacga 
tcctccttca 



gcagcaagag 
aggcttcggc 
cggactcgac 
cctacaccgc 
atgcgccacg 
cgccccactg 
gggcccacgc 
actttaccgg 
acttgcctgg 
cgggcggatt 
cgtcggtcga 
tgcagcgtca 
tcatggaccg 
acacggagat 



cgacgtcgcc 
atcggttcag 
tcacgcggcc 
tgacgtatcg 
cggtggatac 
cgccttcgcg 
ccaggtagat 
agggcgcgct 
ggcgaagcgg 
aacacacgct 
gcggtcaccg 
atgacgaacg 
gcggcctcgc 
ataatcgcga 
gtgaagtagc 
cacgcctcgg 
accgggatgc 
ggaaatgcga 
agcggaaacg 
agcgaggcta 
agcagcgcga 
gccgagagac 
ccgtgggagt 
taaaggagca 
ctcatagcga 
acacgtaacc 
ggggcctcgt 
gcgtagcacg 
agtaatgtct 
cgcgactggc 
gccaggtcgg 
ctgccggggg 
catggatggt 
ccacctggcc 
cggcgtctcg 
gcagcgtcat 
gctcatcgtc 
tgcgtatccg 
cacctcgcga 
tgcggacgag 
cgaccacctc 
gatgccgcat 
cgccggcgcg 
ctttgcgacg 
tccgcgcgcg 
cgtcgcgttg 
tgaggtcatc 
gcggcatccg 
tgcgcacacg 
gctgcccggc 
cgtgggcgag 
gcgcacggcc 
cggcgatctc 
ccaggtgaag 
gcatcccggt 
gctggtcgcg 
acaggtctcc 
ggatccggcg 
ggcggagatg 
cgaggtgctc 
cgctgtctat 
gacggcctgc 
cgttccggcg 
cgtggactat 
catcttcgtc 
gctcacgaaa 
gatggcgctc 
catccccggc 
gaaccggttc 



gttttcgctc 
ccgcatcgcg 
gtggctctgg 
gttgatctcg 
gggctcccag 
ctggctctgg 
ccggttgatg 
gtcgcgtccg 
gacctgctcg 
gagaatgatc 
ctcggggcgg 
cggcaagaaa 
cgtagtagtg 
tcgggatcgt 
cgtacacgat 
ccacttcgag 
tgagcgcgga 
tctgattgaa 
cgagcaccca 
gtatcgacac 
gcccggcgag 
gcgcccggcg 
agttgtcgtc 
ggaaactcac 
tcgggacgaa 
atcacgctca 
gaatctaccc 
gaagatcatc 
ctggccacag 
atccggcgag 
cagccggttc 
gcggaaccgt 
tccgtgagac 
gtggaccgcg 
ctctcgtatc 
ggcgtcgatc 
gggatcctcg 
aaggaccgcc 
cgcgtgagcg 
cccggggcgt 
gcgtacgtca 
cgtccgttgg 
cgcacgctcc 
ctcgcgaccg 
ctgctgcggg 
cagcagcttg 
acggccggcg 
cagtgcacgc 
ctgactagcc 
gtcgagattc 
atcctcattg 
gaacggtttc 
ggccgtgtgc 
attcgaggct 
gttcgggaga 
tatgtgattc 
ggatggcaga 
cttgccgcga 
cgggagtggg 
gagatcggct 
agcgcgacgg 
ggcgccagcc 
gccgcgttcg 
ctgacgcgcg 
ggcgacgtgc 
gcgccggcgg 
gagcgcgagc 
atctcgggag 
aggtacgacg 



gacggcacgg 
tcgtacacca 
taccagtaca 
atgggctcgg 
ccggcgcccg 
tagtagccga 
tactcgtcca 
gtccaccgct 
gcgcgcatcg 
agtcgtctgg 
aggcgtgaag 
gacgagccag 
cgccgccacg 
cgagagcgca 
gcccagcgtg 
cgtcgtgtgc 
cagcacgccc 
gatgatcgac 
gagcttgcgc 
gcgcgtcaga 
cacgatgacg 
ctcccaggca 
gttgatccag 
ggcggcgata 
tatagcaaag 
caggaagtta 
aatggatgac 
ggcctcttca 
cgccggtcaa 
aaaagtcgcc 
gttttccgtc 
gcgaccgggc 
agccgcgttc 
cggcgcgcca 
gcggcctgac 
ctgacgtggt 
ccgtgctcaa 
tcgagtacat 
gtctgatccg 
tggcggaaga 
tctacacctc 
cgaatctcat 
agttcacttc 
gcggcgagct 
tgctgcgcga 
ccgtcaccgg 
agcagctccg 
tccacaacca 
cccccgcgga 
tgatcgtcga 
gcggcgtgtg 
tctcgcatcc 
gcgacgatgg 
tccgtgtgga 
ccgccgtcgt 
cgtcagccga 
ccgtgtggga 
ccggctggcg 
ccgatgagac 
gcggcacggg 
atgtctctcg 
acgtcgagct 
atgccgttgt 
tcctcgccgg 
cgaacctgcg 
atcttccgat 
tcgtgatcga 
cccaggtcca 
tcgtgctcac 



ggctcgtcac 
tgtagatctt 
gcacgatttg 
tgcggcccgg 
gcaggcagtt 
tgtagagcga 
cgccgagcac 
ccatctggaa 
agcgcgcggc 
tcatgaatct 
cgccgcagga 
cccgagaacg 
cccgtgccgg 
atcgccaggc 
agcagcgaga 
gcgagcacga 
tcgcccacgc 
gggatcggga 
cagccgagca 
aagagctcgg 
agaccgagcg 
aagtaggccg 
tcctgcgcca 
ccgagcggaa 
ggctctcccg 
gcggctaaat 
aaatatccct 
gatgcaggac 
gtgaaatccc 
tcgtcggccc 
agtatttttg 
cgatcacgcg 
gggagacggc 
atttcctgac 
tcgccgcgcc 
ggtcggcatc 
ggccggcggc 
ggtcagggac 
cgtcgatggg 
gggccatgcg 
gggatcgacc 
cggctggcag 
gccgagcttc 
cgtgctcgtg 
tcgatccgtc 
caccgacgat 
ctcgacgccc 
ctacggcccg 
atggccggcc 
cgagcatcgc 
cctcgctcgc 
gcttcgcccc 
cgcgatcgag 
gccgggcgaa 
cgcgcacgac 
tcaggcggcc 
ggacacctac 
gagcagctac 
cgtcgcgcgc 
catggtgctc 
cgccgcgctc 
cctctcgcgc 
gatgaactcg 
cgccgcccgc 
cctgctcgag 
tgccctcgtg
tccgatgttc 
gatccggcag 
ggtgggcgga 



21480 

21540 

21600 

21660 

21720 

21780 

21840 

21900 

21960 

22020 

22080 

22140 

22200 

22260 

22320 

22380 

22440 

22500 

22560 

22620 

22680 

22740 

22800 

22860 

22920 

22980 

23040 

23100 

23160 

23220 

23280 

23340 

23400 

23460 

23520 

23580 

23640 

23700 

23760 

23820 

23880 

23940 

24000 

24060 

24120 

24180 

24240 

24300 

24360 

24420 

24480 

24540 

24600 

24660 

24720 

24780 

24840 

24900 

24960 

25020 

25080 

25140 

25200 

25260 

25320 

25380 

25440 

25500 

25560 



WO 2004/013327 



PCT/EP2003/007765






cggccggccg 
cgtcagctcg 
accaacgcgc 
accggcacgg 
ccggaatcgt 
gccgagatcg 
gccgctcccg 
gcgacgaatc 
ctgcaggagc 
ccgctcacgc 
cccgatctgg 
gtctggcaga 
ggcgggcact 
aagcagtggt 
ctcagtgagc 
cgccagcgcg 
taatgcagaa 
tcgaggagta 
aggagctggc 
cgcgcgggct 
gcgaagccga 
tcgaacacgc 
tgagcacggg 
tgctccagcc 
gccgcgtctc 
cgacgtcgct 
cggcgctcgc 
agggcggcat 
cggtcttcag 
acgggcagac 
aggtcagttt 
gcgccgccga 
cgctcggaga 
cggagaacgg 
cggccggcgt 
cgacgctgca 
tgaacgccgc 
cgttcgggat 
agcgaccggg 
cactcgacgc 
tcgccgacgt 
tcgcctgcac 
tcgccaccgg 
gcacgcagta 
aaaccgagcg 
cggatcatgc 
ccgcgctctt 
ccgacgggct 
tgtcgctcga 
cggccgggga 
atggactcgc 
acgcgctcgc 
acatcgcggt 
tcctccgcgg 
cgtggctgac 
ccgtcaggtt 
aagtcgggcc 
cgcacgtcgt 
tgcagcaggc 
acagaggcga 
acctggtcga 
ccgatgctac 
atcgtatccg 
agatcgatcc 
cgagtctgcg 
aggcaccgac 
ttcctgcgcc 
gccctcatgc 
ttcaacagca 



ggcgccgtcc 
ggcaactgct 
ggctgcgtca 
ccggcgagac 
ggcggcgtct 
gcgccatcgg 
tgactcgtca 
cgctccgcga 
gcgtgccgga 
caagcgggaa 
acgtggcgtt 
cgacgctcca 
cgctcctgct 
cgatgatcga 
ggccggacgc 
agatgctgac 
cggcatcgcc 
ctggcgcaac 
gacggccggg 
cgtcccggat 
gctgatggat 
ggcgatcgcg 
gatgacgaac 
ggaggacgtg 
gtacaagctg 
cgtcgcggtg 
cggcggcgtg 
cgggtcgcca 
caacggcgtc 
gatctacgcg 
cgcagccccg 
cgtggcgccg 
tccgatcgag 
gttctgcgga 
cgcggggctg 
ctacgtcgag 
gctgcgacca 
cggcgggacc 
cgccgagccc 
ggcggctgcg 
ggcgcacacg 
cagccgcacc 
cgacgccgaa 
cgcgaacatg 
ctgcctcacg 
ggacgtcgag 
tgtcgtgcag 
catcggtcac 
gaacgcgctc 
gatgctcggc 
gatcgccgcg 
ggcctttgcg 
cgcggcgcac 
catcgcgctc 
cgcggacgaa 
cttcgacggg 
cggacagacg 
tcaagcgacg 
ggtgggcagg 
gggcaggcgt 
atcgccccag 
tcagcagatg 
cgagcggctg 
cgcgctgtcg 
gttcaaggcg 
gatcgacgcg 
ggcggaagcc 
ggccctggct 
attgcagttg 



gtccgtcgcg 
cagcgccgat 
gccggccgcg 
gcttgagagg 
cgcgaacgcc 
cttcgatgcg 
cgcggttccc 
tgcggtggcg 
tcacctggtg 
gatcgatcgg 
tgccaagccc 
gatctcgtcg 
cgcgcaggcg 
gatgttccag 
cggcgccggg 
gcgtcagggg 
ctcgtcggga 
ctcgtcgccg 
atcagccaga 
ccggacctct 
ccgcagcacc 
ccgcggtcgt 
agcacgtacc 
ctgccggcgc 
cacctgcgcg 
gtgcaggcgt 
tccgtgtcgt 
gacgggcact 
ggcatcgtcg 
gtgatccgcg 
agcgtggacg 
cagacgatcg 
gtggctgcgc 
ctcggatcgg 
atcaaaacgg 
cccaacccga 
tggccgcgcc 
aacgcgcacg 
gccagcgacc 
aacctcgcga 
ctgcgcgccg 
gatgcgatcg 
tccggcagcc 
ggccgggagt 
cacctgtcac 
cgggccgcgc 
tacgcgctcg 
agcatgggcg 
gcgctcgtgt 
gtcccgcttc 
gtgaatgcgc 
cgcgtgatgg 
tcgccgatgg 
gacgcgccaa 
gcgaggtcgc 
ctcggcacgc 
ctgtccggat 
ctccggcacg 
ctctggacgg 
cgtgtgccgc 
gcgccggtcc 
acgacaactg 
accgaaatcc 
tttctggaga 
gagttcaagg 
ctcacgagct 
gctccggcgc 
ggcgtgccgg 
atggccgagc 



ctcgactggc 
gcgccggacg 
tgcttcgatc 
gccgccgcgg 
gccggctacg 
cggttcgagc 
gccgcggagc 

cggcgcctca 

ccgtcggcct 
cgcgcccttc 
gccacggagc 
gtgggattgc 
ttcgagcgta 
tacccgacgc 
cagcttgccg 
cctgccctga 
tggccggccg 
gcacggagtc 
gcgatctgca 
tcgacgccgc 
ggctcttcct 
tccccggtct 
tgctgtcgaa 
tgctcggcaa 
gaccgagcat 
gtcagtcgct 
tcccgcaaca 
gccgcgcgtt 
tcctgcgaag 
gatcggcgct 
gccaggcgga 
actacgtcga 
tcacgaaggc 
tcaagagcaa 
cgctcgcgct 
agtgtgattt 
gtggccatcc 
tcgtgctcga 
tgctcgtgat 
gtcatctcga 
gccgacagga 
ccgtgttgcg 
gtgccgtggc 
tgctgccggt 
tctccgccga 
attggctcga 
cccgtcagtg 
agtacacggc 
gtctgagagg 
ccgaggcaga 
caagctcgtg 
ccgaggccgg 
tcgatccgat 
gggttccagt 
cgcagtactg 
tgctcgccga 
tcgcacggca 
ccaaggaacc 
caggggtgcc 
ttccggccta 
cagcgccgct 
tgccttcgtc 
tgcacaaact 
tgggattcga 
tccggatcac 
acatctacgc
ctttcgtggc 
cggccgcgcc 
aactccgcat 



gtacggagcc 
cgatcgagat 
tcgtcaagcg 
acacgcgcga 
acgccgagtt 
gccggggcac 
cgcgggccat 
cgcccgagct 
tcgtcgtact 
cggcgcccga 
tcgaacgcaa 
acgacaactt 
ttcggccgct 
tctattcgct 
gcgcgcagga 
gcagagtcga 
ctttccgggc 
gatcacggtc 
gaatcccgac 
gttcttcggc 
cgaggcgtgc 
catcggcgtg 
tctccacagc 
cgagaacgac 
gaacgtgcag 
gctcacgtgg 
ggaaggctac 
cgatgcgaac 
gctcgaggac 
gaacaacgac 
ggtcgtggcc 
ggcgcacggc 
ctttcgcgcc 
tttcgggcac 
ccatcaccag 
cgccaacagt 
gcgacgcgcg 
ggaagcgccg 
ctcggcgaag 
ggcgcacccg 
gtttccgcat 
cggcgcggac 
cttcatgttc 
cgcgcaggtc 
cgtccgccgt 
agctccgtcc 
gatggcgtgg 
cgcgtgcctt 
ccggctgttc 
ggtgcggccc 
cgtcgtttcc 
tgtcgaatgc 
cctggcggag 
cgtgtcgaac 
gacgcggcat 
gccgaatcgg 
gcatccacag 
ggcgccggac 
cgtgaactgg 
tccgttcgag 
tcaagccctg 
tccggcccga 
gagcggcctc 
ctcgttgttc 
gttccgccag 
caagctgcct 
cgcaagcgtg 
cggcacgctg 
gctgggcgga 



ggtcgccgag 
caggaacgtg 
tgccgactgg 
cgcgatcgac 
cacctggtcg 
cgcgtcgctc 
tcagcagtac 
tcggcgattc 
cgatgcgatg 
gagccgccgt 
gatcgcgcag 
cttcgatctg 
cgtgccgggc 
cgccgggttt 
tcgcggacgc 
aggggtccgg 
gcgcggtcga 
ttcaccaacg 
tacgtccgcg 
atcagccagc 
tggctcgcgc 
tggggcggca 
cacgcgggat 
tacctcacga 
accgcgtgct 
cagtgtgatg 
gtctacgtcg 
gcccaaggca 
gcgatcgccg 
ggctcggcga 
atggcgcaga 
accggcacgc 
ggcggcgcca 
ctcgactcgg 
cagattccgg 
ccgttcttcg 
gccgtgagct 
cccgccgcgg 
acgccggcgg 
gacgtcagca 
cgacgagcgc 
gccaaacgtg 
tccggcggcg 
ttccgcgaag 
gcgctgtttc 
gttggtttgc 
ggtgtgcagc 
gcgggcgtga 
gagacgcttc 
gcgctcggcg 
ggcgcgcccg 
cgccgcctcc 
ttcgagcgtt 
ctcagcggct 
ctgagacaaa 
gtgctcctcg 
aaaacggcgg 
gtcttgttcc 
gctgcttaca 
aagaagcggt 
gcgccaaggg 
tcgcgtgccg 
ccggcaggcg 
ctgacgcagg 
ctcttcgaag 
gcggacgcgt 
ctgtccgcgg 
cagcacgtga 
aatcccgccc 
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tgctgcccgc 
cccgcgctac 
aagccagtgc 
acacacggcg 
atccacgctc 
ccgaccggtc 
tgggcgggtt 
aggaacaact 
cggcgctcgt 
aagccgtgct 
tcttcgaagg 
acggcgaatt 
tcatcctgct 
agatcgccgc 
agttcctgca 
aagtcatcac 
cggatcttgc 
gcaatgcgaa 
tcccggaggt 
ccgctgcgaa 
tggcagatcg 
cgccgtatca 
agacggcggt 
gtcccttctt 
tcgacgagtc 
cattgcaggt 
agatctggct 
cgctcgacct 
tcgcgcgtca 
ttccgggcac 
agcacgaggc 
gcgggccgtt 
tggcggccca 
cgacgttcta 
tgagccgctt 
aggcgttctg 
ggccgcggcc 
cgctcgctcg 
tgctcggcgt 
gcatcccggt 
tcgtgccgct 
tgaggacgca 
ccaagctgcg 
tcgagcccgc 
cgcgcgtgac 
tggtcgaggc 
atagctatcg 
tgccggtgat 
ccgactaccc 
cggagggact 
gcgcaaaccg 
cgctgtgcct 
gcgccgcata 
aggattccgg 
ccaggacgcc 
atgatctgaa 
cgacgggaca 
cggtgatcga 
cggccgatag 
tcacctcaga 
tcgacttcct 
cctcgctgct 
tggaagagct 
agaccacggt 
gcgtgccggc 
cgcccgcgcc 
ggtatctcaa 
agccgggagc 
tcgagttcct 



gatgccgccg 
gacggcgccc 
ggagctgtcg 
cacggccgga 
ggtcgccgga 
ggaaggatcg 
cggctcgctt 
ccaccggggg 
caaggaattc 
ggcggccacg 
gtcgtatcac 
gcggtcggcc 
cgagtacgcg 
gatcctcgtc 
cgaggttcgc 
cggcttccgc 
cgcctacggc 
gttcatggac 

gggggtcacg 

ggccgtgctg 
gatcgagcgg 
gattacgcag 
gctgctgttc 
cttcacgacg 
gctggccgag 
gaccgggacg 
cggcggccag 
gcgcgggcct 
cgaatcgctg 
gtcggtgccg 
cgtgcagaag 
gctgcgcgtg 
ccatctcgtg 
ttcggcccag 
cgcgcgtgag 
gatccggcaa 
gccagtgaag 
cggcgccaag 
cttcgagacg 
cgcggtgcag 
tcgcgcgcgc 
gtcactcgac 
gttgccgaag 
catcgtcgat 
atcgaagcgc 
cgactacagc 
gatcctgctc 
ctcggagccg 
gcgcgatcgc 
cgcgctcgtc 
gttggcgcat
ggaacgatcg 
tgtaccactc 
tgccgtgctg 
cgtgctccac 
ggtcgccctg 
accgaagggc 
acggatcggg 
cggtgtgacc 
ggtcgcgacc 
gaagatcacg 
gccgcgccgc 
gcagcgtctc 
cggcgtgttg 
cggcaggccg 
gattggcgtc 
tcgtcccgac 
gcgcatgtac 
cggtcgcgcg 



ccagcgtcgt 
gtctcgaagc 
cccgagcaga 
tcgaagcgac 
ttcaaccggc 
aaggtctggg 
ctgttcggac 
ttcgagctcg 
acggggatgg 
cgttttgccc 
ggtctccttg 
cccgcggcgc 
aacccgaagt 
gagcccgtgc 
cgcgtggcgg 
gtgcatcccg 
aaggtcgtcg 
gcgttcgacg 
tactttgccg 
acgtacatga 
ctcgcgaacg 
ttcagctcga 
ttcctgatgc 
gctcacaccg 
atgcaggcgg 
gcggaccagg 
atgggcgagg 
ctgaatcgcg 
cgcatgacgg 
gttgaatggc 
gtgctcgagg 
accaccatga 
tgtgacggct 
gttcgaggca 
gatcgcgagg 
ttcgacaccg 
tcgtaccagg 
aagctcgccg 
ctcgtctacc 
cctctgctcg 
gtgtcaccca 
ctcaatgagc 
cacccgggcc 
cccgggttcg 
gacctgcacg 
accgatctgt 
gaggccgtcg 
gaccggagtc 
gtcatccatc 
tgtggtgccg 
gcgctcgtga 
gccgggttca 
gaccccgtct 
gtcgtgaccg 
ctcgacgccg 
tcggccgaag 
gtcgtcatcg 
ttacccgccg 
gggctctcgg 
gacgggcggg 
ccgtcgctgc 
tgtctcatgt 
gcaccaggct 
acgtactgga 
ctcgggaacg 
gtcggcgagc 
ctcaccgccg 
cggaccggcg 
gactatcagg 



ccgtcgcggc 
cgcatgcggc 
tggccgcgct 
gcctggccga 
tctggaagga 
acgtcgacgg 
accgttcgcc 
gaccgttgcc 
agcgcgttgg 
gaacggtcac 
acgaagtgct 
ccggcatcgc 
cgctcgaggt 
agagccggcg 
acgagatcgg 
gcggcgccca 
gcggcggcat 
gcggccactg 
gcacgttcat 
agtcgcaggg 
acgcgcgcgc 
tgatgagcct 
gcgagcgcgg 
aggccgacta 
ctggcttctt 
acgtcctgcc 
cggcatcacg 
cggcgatgca 
tggccagcga 
aggacgtgtc 
gcggcgacgg 
cgctcacccc 
ggtcgttcgg 
cgcgtgcgaa 
ccaagcagag 
tccccgagcc 
gcgcccgtgt 
ccgaacaccg 
gactgaccgg 
gcgaggatct 
ccgcgacgtt 
accagaactt 
gcgatccgct 
acggtctcac 
tgaacgtcat 
tcgacgaacc 
tgtcgtcccc 
agcttgtcac 
aactcgttga 
agcggctgac 
agaagggcgt 
tcgtctcggt 
cgcccggcga 
attcgcggtc 
acagtccgcg 
acgccgcgta 
ggcaccgcca 
gctcgagtta 
cgctcgcgat 
cgctcgggtc 
tcgcgtcgct 
tgggcggcga 
gccggatcat 
ccggtgaccc 
tccgcgccta 
tctgtatcgg 
agcgcttcgt 
accgcgcccg 
taaaagtacg 



ggccgacgcg 
cttcaagccg 
cgccgcctgg 
gtaccggccg 
gatgatctac 
caacgagtac 
gatcgtggtc 
gccgctcgcc 
attctccaac 
cgggcgcgac 
cagccggccg 
cggcagcgcg 
catccgcgcg 
gctcgacctg 
cgccgcgctg 
ggcgcacttc 
gcccatcggc 
ggagtacggc 
gcggcatccg 
tcccggcctg 
gatcgtggcc 
gaacttcccg 
gattcatatc 
cgccgccatc 
cggcgcgccc 
gttcaccgaa 
cgcctacaac 
gcgcgccgtc 
accgctcggc 
gggattgccg 
cgtggcgttt 
cgagcatcac 
cgtcatcctg 
cctcgaggcg 
tcctgaagcg 
gctcgagtta 
gtccgtgccg 
cacgaccctg 
ccagcaggat 
ggtcgcgcac 
ctcggagtac 
cacgtacggc 
cgtctcgacg 
ggcgcgttcg 
ggagatggac 
gacggtgcgc 
cggccggtcc 
cggctggaac 
agagcgcgcc 
ctggaccgag 
cgcccccggg 
gcttgccgtg 
tcgcaagtcc 
cgaaggccga 
aatcgcgcaa 
cgtgatctac 
gctcgtgaat 
tgcgctcgtg 
gggctgggtg 
gtatttgacc 
gctcggcgat 
gccgtcacgt 
gaaccactac 
cagggatctg 
cgtgctcgat 
cggcgcgtgc 
ccccgatccg 
gttccgggcc 
cggattccgc 



ggactgaagt 
atcgacggca 
atcgatcgat 
gtgcttgccg 
ccgatcatca 
gtggatctcg 
gacgccgtgg 
ggcgaggtcg 
accggctccg 
aagatcgccg 
ctcgtggtga 
gtgagcgaag 
cacgagtcgg 
cagccgaagg 
atcttcgacg 
ggcgtccgcg 
atcgtcgcgg 
gacgggtcgt 
ctcgcgctcg 
cagacgcggc 
aggcacggcg 
cacgaccaga 
tgggagggcc 
ctccgcgcgc 
gcccactccg 
ggacagcagg 
gaagtcgtcg 
gacgaggtcg 
ctgcgtcacg 
gaggacgctc 
gacgtggaac 
gtcttcgtca 
cgcgatctgg 
ccgatgcagg 
gccgagaccg 
cccgccgatc 
tttgacgcgg 
tttacgacgc 
ttcgtcgtcg 
tgcgtcaact 
ctgaacacgc 
agtttgctcg 
agcttcacgc 
ctgacggttc 
ggcggtctgc 
cggtggctgg 
ctcctgagcg 
gacacggccg 
gtgcagaccc 
ctcaaccgac 
agcagggtgg 
cacaaggccg 
ttcgtcgtcg 
tggctgacgc 
gagtcgcacg 
acgtccggat 
tacacgtggg 
tcgaacgtcg 
ctgcacgtga 
gttcacgaca 
cacccgtcgg 
ccggcgtggg 
ggcccgacgg 
ttcgtcgaga 
ccggccggag 
gtggcacgcg 
ttcgccaccg 
gacgggaaca 
gtcgagctcg 
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gcgaaatcga 
gaaaagatca 
cgatctcgat 
cggcggtgat 
gcgccctgcc 
ccgagacgga 
gcgcgcacga 
tgaaggtccg 
ccatcgccat 
tcacgccgag 
ccacccgcag 
cgtgtttctg 
gcctgaaagc 
cgccgcgctc 
gttctcgctg 
ctccctcgtg 
tgcggccgac 
actcgtcatc 
cgacgggtgg 
cgacgggcgc 
ccagaaggaa 
gctggcgggg 
gacctcccgg 
cgcattcagc 
cgtgctgcat 
cgatcgtccg 
caacgcgtcg 
tgcggcctac 
gcggacccgg 
gaagatgacc 
gtgcgacctc 
cagcgtcgac 
gcttcagacc 
cgacgagcgc 
cgtcagtctg 
gacgctggag 
ccggcggttg 
gtcgttcgat 
gctcgatttc 
catcgtgctc 
ggacgacgag 
cggagaccac 
cgcgatcacg 
cgacgtgcgg 
gtggggcgcg 
ggagacgctc 
gctcttcaac 
gatcggcggc 
cacgcggctc 
gattccgcgc 
tacgcaggcg 
gctctgcctc 
ggaacggttc 
agacctcgcg 
gatcaaactg 
gggagacgtt 
cgcctacgtc 
gaaggcgcgc 
gcgaacggcg 
cgcgcccgag 
ctggagcgaa 



agcggcgctt 
gcccggcgac 
ggcggagctt 
cgtcgcgctc 
ggcgccgccc 
gcgccggctt 
caacttcttc 
cgacacgttt 
gatggcggca 
ggcccgcgtc 
actgagtccc 
gcgccggcgt 
tccgcctaca 
gaacgcgcca 
cagggcgatg 
gacatgcgtc 
gacggcgcgc 
ctcgagccct 
tccggcggga 
gacccgaagc 
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gatcggcatc 
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aaggcgttcg 
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gcaatcgacg 
agggcccgca 
ccgacctgcg 
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tcaacaccat 
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tggtggagat 
atgggagtca 
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tgccctccgt 



agggcgtcga 
tcggctatgt 
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tgtcgctcat 
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Figure 5b 

CDS complement (3..914)
/blast_score=2e-66
/blastp_match="AE004644.PA2177 Pseudomonas aeruginosa"
/gene="regulator hybrid"
/note="probable similarity to prokaryote sensory transduction proteins"
/product="sensor/response regulator hybrid"
CDS 924..2168
/note="none"
/gene="ligase"
/blastp_match="AP003013.MLR8297 Mesorhizobium loti"
/blast_score=e-152
/product="2-amino-3-ketobutyrate CoA ligase"
CDS 2207..3190
/blast_score=e-151
/blastp_match="AE008872.TDH Salmonella typhimurium"
/note="none"
/gene="dehydrogenase"
/product="threonine 3-dehydrogenase"
/pfam_match="PF00107; adh_zinc; 1"
CDS 3373..4455
/note="putative"
/gene="methyltransferase"
/blastp_match="AE001866.DR0026 Deinococcus radiodurans"
/blast_score=1e-08
CDS 4546..4959
/blast_score=2e-11
/blastp_match="AP002997.MLL1617 Mesorhizobium loti"
/gene="unknown"
/note="pfam00263, GSPII_III, Bacterial type II and III secretion system protein. Expect = 7.8"
CDS 5176..6192
/blast_score=5e-98
/blastp_match="AF064070.PE20 Burkholderia pseudomallei"
/gene="glucose epimerase"
/note="putative"
/product="UDP-glucose 4-epimerase"
/pfam_match="PF01370; Epimerase; 1"
CDS 6331..14043
/note="substrat AT, malonyl; zinc depend dehydrogenase; zinc dependent adenosine deaminase putative"
/gene="PKS I"
/blastp_match="AF285636.WCBR 2547 Burkholderia mallei"
/blast_score=0.0
/pfam_match="ketoacyl-synt"
/pfam_match="ketoacyl-synt_C"
/pfam_match="Acyl_transf"
/pfam_match="SAM binding"
/pfam_match="adh_zinc"
/pfam_match="pp-binding"
CDS 14275..15408
/blast_score=e-104
/blastp_match="AF285636.WCBT Burkholderia mallei"
/gene="acyl-CoA transferase WcbT"
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/note="putative" 

/pfam_match="PF00155; aminotran_1_2; 1"
CDS 15436..16245
/blast_score=5e-11
/blastp_match="TTDEFFMT.FMT T. thermophilus"
/gene="formyltransferase"
/note="evidence experimental"
/product="methionyl-tRNA formyltransferase"
/pfam_match="PF00551; formyl_transf; 1"
/pfam_match="PF02911; formyl_transf_C; 1"
CDS 16287..17384
/note="putative"
/gene="glycotransferase"
/blastp_match="AF285636.WCBD Burkholderia mallei"
/blast_score=e-99
CDS 17427..18158
/blast_score=2e-82
/blastp_match="AF285636.WZM Burkholderia mallei"
/gene="ABC-2 transporter Wzm"
/note="putative"
/pfam_match="PF01061; ABC2_membrane; 1"
CDS 18248..18847
/blast_score=7e-61
/blastp_match="AF285636.WZT Burkholderia mallei"
/gene="ABC-2 transporter Wzt"
/note="putative"
/pfam_match="PF00005; ABC_tran; 1"
CDS 18952..20346
/note="putative"
/gene="glycosyltransferase"
/blast_score=e-101
/blastp_match="AF285636.WCBE Burkholderia mallei"
/pfam_match="Glycos_transf_1"
CDS 20442..21167
/note="putative"
/gene="unknown"
/blast_score=1e-26
/blastp_match="AE009248.ATU3189 Agrobacterium tumefaciens"
CDS complement (21164..24301)
/note="transporter domain"
/gene="unknown"
/blast_score=5e-29
/blastp_match="AE009122.ATU1658 Agrobacterium tumefaciens"
/prosite_match="PS00402; BPD_TRANSP_INN_MEMBR; UNKNOWN_1"
CDS complement (24351..27023)
/note="none"
/gene="unknown"
/blast_score=5e-29
/blastp_match="AP003581.ALR0267 Nostoc sp"
CDS complement (27806..29686)
/note="none"
/gene="cell surface protein"
/blast_score=2e-34
/blastp_match="AE010748.MA0851 2567 Methanosarcina acetivorans"
/product="cell surface protein"
CDS complement (29535..30872)
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/note="none" . . 

/gene="cell surface protein"
/blastp_match="AE010748.MA0851 Methanosarcina acetivorans"
CDS complement (30848..32647)
/note="fragment"
/gene="O-antigen"
/product="O-antigen biosynthesis protein"
/pfam_match="PF00535; Glycos_transf_2"
CDS complement (32574..35555)
/note="putative"
/gene="glycosyltransferase"
/blastp_match="AE013462.MM 2213 Methanosarcina mazei"
CDS complement (35533..36598)
/blastp_match="AE013462.MM 2213 Methanosarcina mazei"
/gene="glycosyltransferase"
/note="putative"
CDS complement (36516..37400)
/blastp_match="AP003581.ALR0267 Nostoc sp"
ATP/GTP-binding protein, esterase fush
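The coordinates in the feature table above can be checked mechanically. The following is a minimal sketch in Python and is not part of the original disclosure: the names `CdsLocation` and `parse_cds_line` are hypothetical, and the sketch assumes each location line has the form `CDS start..end` or `CDS complement (start..end)`, with optional spaces around the two dots.

```python
import re
from typing import NamedTuple


class CdsLocation(NamedTuple):
    """Coordinates of one CDS entry (1-based, inclusive)."""
    start: int
    end: int
    strand: str  # "+" for the listed strand, "-" for its complement

    @property
    def length(self) -> int:
        # Inclusive coordinates, so add one to the difference.
        return self.end - self.start + 1


# Hypothetical pattern for the location lines of the feature table.
_LOCATION = re.compile(
    r"^CDS\s+(complement\s*\()?\s*(\d+)\s*\.\.\s*(\d+)\s*\)?\s*$"
)


def parse_cds_line(line: str) -> CdsLocation:
    """Parse a line such as 'CDS 924..2168' or 'CDS complement (3..914)'."""
    match = _LOCATION.match(line.strip())
    if match is None:
        raise ValueError(f"not a CDS location line: {line!r}")
    strand = "-" if match.group(1) else "+"
    return CdsLocation(int(match.group(2)), int(match.group(3)), strand)
```

For example, `parse_cds_line("CDS 924..2168").length` gives 1245, a multiple of three, as expected for a complete coding sequence.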
Sequence 37507 BP; … A; …507 C; 13011 G; 5458 T; 0 other;

ga tecoacga tcggcgccag cggattgcjc ^gcg cjjjcjge jjjjgjtoj ^ 

gtcacgtggc ggccctcggc <*ccagctcc ^catgcg 9 «y cacgacgacg 180 

gtcaccttgg cgaagcccat ^tgccttcca ^cgcgtcg J * c ggcgC ggcgg 240 

rcggcccaga agcgcgtgcc £ccttgcgc acgcgcc g tc ctC gggcgga 300 

tgcagcagcg catcgtgcag ^cggcctcc gggcgg gg gtagC cggt gatgcgctcg 360 

tagaagagcg aaaagtgccg tccgatgatc tcttcgg ^ gat cgc gtagtcc 420 

gcgcccttgt tccagctggt cacgtggccc gcagct g ctcctcgtgc 480 

?gcacgccct cgatcaggag Iggggg ttgtcglagc cgaccagcgt gccgtcctcg 540 

tcgcggcgct cggtcaggtc ^gcgtgacc "9 9 ^ ggccgtcc tt gcgcacgcgc 600 

tcgcgcagcg ccgtgatgat ^"ggcc ^J^J | gC agctcctt ggcgggcoag 660 
cagccctcgt cctcgaagcg gccgtcctcg cgty y y ggC ggccgat gatctcttcg 

rcgcgcgcga tgtcctcctc ggtgta«wc ^?^acg aacgcacgta gccctcggga 780 

gcgcgatagc ccttcaactg "cggcgccc ^gttccacg J && ga tggcgtcc 840 
tcgagcacga agatcgcgta gt^gacg gac 9 gC accgaggg ccgcggaccg 900 
tcgccggcca gcccgctcgg ^gccgaag gcgagg ^ | t cggaC aatca 960 

gatgggcttg gcatggctat tagggtaoaa egg g 9 acgcaggga g ctggggacgc 1020 

gagataattg cttgatgaag atgccatttt ct * tccgca g ggcgccagcg 1080 

?cgaggccca gggcctgtac aagtccgagc ggtgctc^ taactacctc ggcctggcca 1140 
tgcaggtggc cgcccacgac gtgctcaacr w y « gcgcgacggc tacggcatgg 

alcacccg?c cctggtcgcg S^gcgaagg agaegctega g g g acgcgcctcg 126O 

cgtcggttcg cttcatctgc 99^^ J c tgcttcgac gccaacggcg 1320 

cgcgcttcct cggcaccgag ^acgcgatcc ^egg ctcggacgcg ctgaaccacg 1380 

ggctcttcga gaegctgetg 99cgaggagg Igacgaagcg cttccgctac gacaacaacg 1440 

cctcgatcat egaeggegtg ^gcctctcga JWJH ggccggcg cg cgctacaaga 1500 

acatggegag cctcgaggcc ^agctgaagg JJBW J gcgaacctg ggc gccatct 1560 
£S5S ™ar 9 gg rgftggtgga cgactcgcac gccgtgggct 
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tcatgggcgc gcacgggcgg ggaacgcccg agcattgcgg tgtcgaaggg aaggtggaca 1680 

tcctcaccgg cacgctcggc aaggccctgg gcggcgcttc cggcggctac acggcgggca 1740 

agcgcgaggt ggtggcctgg cttcgcaacc gctcgcgccc ctatctcttc tccaacacgc 1800 

tgatgcccgc catcgcgggc gcgtcgctca aggtgctcga tctcctcgag ggcggcggcg 1860 

agctgcgcgc gaagctcgcg cgcaacgccc gccacttccg cggcgagatg acgcgcctcg 1920 

gcttcacgct ggccggcgcc gaccatccga tcatcccggt gatgctgggc gaggccccgc 1980 

tcgcgaagga gatggcggac cggctgctga aggaaggcat ctacgtggtc ggcttctcgt 2040 

ttcccgtggt gccccgaggg caggcgcgca tccgcacgca gatgtcggcc gcccatgaac 2100 

cgaagcacgt cgatcgcgcc atcgccgcct ttgccaaagt cgggcgcgat ttgggagtca 2160 

ttgcttgaga accctttcga agacgaagcg cgagcccggc atctggatgg tcgagtcgcc 2220 

caagcccgtg gtcggccaca acgacgtgct gatccgcgtg aagaagaccg ccatctgcgg 2280 

caccgacatg cacatcttca actgggacga ctggtcgcag aagaccatcc cggtgccgat 2340 

gacggtcggc cacgagtacg taggcatggt ggaggccatg ggccaggagg tgcgcggcct 2400 

gcaggtcggc cagcgcgtct ccggcgaggg ccacatcgtc tgcggccatt gccgcaactg 2460

ccgcgccggg cgccgccacc tgtgccgcaa cacgcagggc gtgggcgtga accgccccgg 2520 

cgcgttcgcg gactacctgg tgattcccgc ggagaacgcg ttcccgattc ccgacgacat 2580 

ccccgacgag atcgcctcga tcctcgatcc gttcggcaac gcggcgcaca ccgcgctctc 2640 

gttcgacctg gtgggcgagg acgtgctgat caccggcgcc gggccgatcg gcatcatggc 2700 

cgcggccatc gcgcgccacg tcggcgcgcg ccacgtcgtg atcacggaca tgaacgacta 2760 

ccgcctggcg ctcgccacga agatgggtgc cagccgcgcg gtgaacgtat cgaaggaaaa 2820 

cctgaaggac gtgatgcgcg agctcggcat ggtcgagggc ttcgacgtcg gcatggagat 2880 

gtcgggcgtg ccctcggcct tccgccagat gctcgacacc atgaaccacg gcggcaagat 2940 

cgccatgctc ggcatcccgc ccagcgaggc cgcgatcgac tggacccagg tgatcttcaa 3000 

gggcctggtc atcaagggcg tctacgggcg cgagatgttc gagacctggt acaagatgat 3060 

cgccatgctg cagagtggcc tcgacctctc gcccatggtc acgcaccgct tcgacgtgcg 3120 

cgagtacctg aagggcttcg agacgatggg ctccgggaaa tccgggaaag tcgtcctgtc 3180 

gtgggattag cgagggccac ggatcgcggc ccggttcccg cggcggcgat cgtgcaggcg 3240 

ggttggcagc acgccccggg atttccgccg tggctcacgt gcgacttcgc cgtcgacgca 3300 

ccgatcgcga tcgacgtcga cgtgagcgtg tcgctgctcg atgcccaggg cacctcgctg 3360 

tggacgcagt cgatggccgg cagcgggcgc gacaccccgt ggctaccgac cggcacgtat 3420 

cgcgcctccc tgcgcctgga gccgttccgc ttcccggccg cggccgcctc ggtcgagttc 3480 

gcgctctcga tgcgggtgca aggtgagcgc cgcgtcgtgg ccaccgcagc gggcgtggtg 3540 

cccgcgggcg caccccacgg cgtggcgcgc gcggcctggc acctcgaggc cctggatggc 3600 

acgccggcgc tcgagacgct ggcctggtcc gatcccacgc acagctggtt ctcccagcat 3660 

ttcgaccatg cgtcgcgcac gtgcgtcgag tacctgggcg ccggatggcc cgggtggcgc 3720 

gggcgcgtgc tggatgcggg atgcggcgac ggcatcacgg cgctcggcat cgcgctgcgc 3780 

tatgcgcccg agcaggtggt gggcatcgat ccggggcgct gctaccgggt gcttcccgac 3840 

atcctcgagc gccacgccct cggccacctc gcgctgccgg acaacctgca gttcctcccc 3900 

gcggatgcca acgcgctccc gttcgccgac agcagcttcg acgtcatcgt ctcgtggtcc 3960 

gcggtcgagc atttcgtggg cggctacctg ccctcgctcg ccgaggcgcg gcgggtgctc 4020 

aagcccggcg gcctgctggt catccatccc gagctctact acaccgcgca ccacggccat 4080 

cacctcggcg agtacagccg cgagcccttc ttccacctgg tgaagtcgcc cgacgaggtg 4140 

cgcgacatcg tgttcggcgc ggacgtgagc ctccacgacc gcggcggcat cgcgccgacg 4200 

cgcgcggagc actggcgctg gtacaacgag ctcaatcgca tcacggccgc gcagttcgag 4260 

gacgagttgc gcgcgctcga cttcgagccg tggcgcctgg cactgcgcgc cgagccgctg 4320

gtggagtaca agccggagct gctgcgctac cgcttcacgg agctgggcgt ctccgagctc 4380 

tacgtggcct gcatcaaccg caagcccgtg tcagatggtt cctacaggac gcatgggcag 4440 

gcggcattcc ggtaaggtcc gccgggacgg gaattgaagg cgcaagcgcg ttgttatccc 4500 

cgtcactccc ggactgccgg gatcaccctc acaggagaac ccccgatgaa gacccagacc 4560 

caggccgtgc tcgccgcggc cctgctgtcc cttgccggca ccacgctggc ccgcgccgaa 4620 

gccgtcacgc tgacgggcgc gagcgaagtc ccgcccgtca cctcgagcgc gaagggcagc 4680

ggcacggtgg tcgtgaaggc cgattgcacg gtcaccgcga agatcaccgt caccggcatg 4740 

accgccaccg cggcccacat ccacgaaggc aaggccggcg ccaacgcggg cgtggcagtc 4800 

ccgttcgtga agaccgcgga cacacggttc gaggcagccc cgggcgccaa gatgaccgag 4860 

gcgcagtgcg ccgcgtacaa ggcgggcggc acctacgtga acgtgcacag cgaggcccac 4920

aagggcggcg aggtccgcgc ccagctggcc ggcaagtagg gcagccgctg caaaaagaaa 4980 

aagcccgcgc aagcgggctt tttcgttgaa ggctttggtc ggggcgaaag gattcgaacc 5040 
ttcgaccccc tgcaccccat gcaggtgcgc taccaggctg cgctacgccc cgaccgaact 5100

tgaaattata cctgcatcgg ggtcgcagcc gccatgaatc gaaacgcccc gggcttcggg 5160

tagcatccgc ccatggcgac gaccctcgtg acgggtggtg caggctacat cggcacccac 5220

atcctctgcg cactggccca ggccgggcga cgcagcatct gcatcgacaa ctactccaac 5280

agctcgccgc gttccatcga acgcgtgctg cagatcgcac ccggttgcgt cgaggccttc 5340

gacgtggaca tccgcgacgc cgacggcatc cgcaaggtga tcgcagggcg cgacgtggac 5400

agcgtgatcc acctcgcggg actgaaggcg gtcggcgaat cggtcgagca gcccgagcgc 5460

taccacgaca acaacgtccg cggcaccgag agcctgctcg cggcgctcgc cgactcgccg 5520

gtgcgcaagt tcgtcttcag ctcctcggcc accgtgtacg gcctcgccga gaagatgccg 5580

atcgacgagg acgcgcccac ctcgccgcag agcccgtacg gccagaacaa gctcgacatc 5640

gagcacatgc tggtcgccct cgccaagcgc gacccctcct ggcgcgtagt gaacctgcgc 5700

tacttcaacc cggtcggcgc acacgagagc gcgctgatcg gcgaggaccc ggccggggtt 5760

cccaacaacc tcatgcccta cgtgtgccag gtcgcctccg ggcgactgaa ggagctgagc 5820

gtctacggca gcgactaccc caccccggac ggcaccggca tgcgcgattt catccacgtc 5880

tgcgacctgg ccgaaggcca cgtggccgcg ctggaagcac tcgagcgcgc ggcccccggg 5940

accgtcctca cggtgaatct cgggacgggg ctcggctaca ccgtgctgca attgctcgaa 6000

accttcgagc gcgtgaacaa actaagcatt gcccggcgca tcgtggggcg gcggccgggc 6060

gacgtcgcga tctgctacgc gaacgcgggc cgcgcgaagg ccgtgctggg ttggaccgcg 6120

aagcgcggaa tcgaggagat gtgccgcgat gcctggcgtt ggcaggaaag gaatgcaaag 6180

ggctacggtt gacggattgc aaatcatggc cttggcgacc cttgctgcaa tgcggcatga 6240

agggggcggt tcttgtaaga tccaaggttc gggatgcgtc caggacgtca aaaacagcac 6300

ccggcgttgg aaggaagaag gcaaccctag tgttgtcatg gctccagacg ggattgcgcg 6360

ccgagtaaga atcgtcagtc gagaccaact cgcctccggg tgcaacgaat cgtgagccga 6420

agcaacacct cgcatcgcgt cgccgtcgtc ggcattgcgt tccgccttcc cggcgctacg 6480

cgcgaatccc tctggccctc gctcctcgag gggcgcaacc tcgtcaccac cgtcgatccc 6540

gcgcgctggg cgaaggagcc gtactaccac ccgcgcaagt ccgagcccgg caccagctac 6600

tcgttcgccg ccggcaccat cggcgacatc agcggcttcg acgcgtcgtt cttcggcatc 6660

tccccgcgcg aggccgcgca gatggatccg cagcagcgcc tgctgctcga gctctcctgg 6720

gaagccctcg aggccgcggc ggtgaagccc tcgaccctgc gcggcagcga ttgcggcgtg 6780

ttcgtgggcc tgtccaccgt cgactacagc tacggcctcg cctacgacct ggcgtccatg 6840

gacgcgtcgt cggccaccgg caacaccgcg agcatcgcgg ccaaccgcct ctcgtatttc 6900

tacgacctcc acggccccag catggtcgtg gatacggcgt gctcgtcctc gctggtggcg 6960

ttccaccagg cctgccagtc gatcgccatc ggcgagacgc gccaggcgct ggtgggcggc 7020

atcagcctgc acctgcatcc gtacgggttc atcgcgttct ccaaggcgtc gatgctctcg 7080

aagcgcggcg cctgccgcgt gttcgacgcc agcggcgatg gctacgtgcg ctcggaaggc 7140

gcgggcgtgc tggtgctgaa ggacctcgac cgcgccgtgg ccgacggcaa cccgatcctc 7200

gccgtcgtcg ccgcgagcgg cgtgaatacc gcgggccgct cctccagcct caccgtgccc 7260

agcgtcgatg cccaggctgc attgctgtcg gatgtgtacg agcgcgccgg catcgacccg 7320

tccgagatcg actacctcga ggcccacggc accggcaccg tggtgggcga tccgatcgag 7380

acccgcgccc tcggcatcgc gctcgggcgc ctgcgcccgc gccaccagcc gctgccgatc 7440

ggttcggtga agagcaacgt cggccacctc gaggcggcct cgggcatggc gggcctggtg 7500

aaggcgctgc attgcctcaa gcaccgcgcg gtgccgccca ccatccacct cgattcgccc 7560

aaccccaaca tccacttcga cgagtggaac ctgaagccgg tcctcgagcc gctcacgctc 7620

catccgtaca agaagctcgt cgtcggcgtg aactcgttcg gcttcggcgg cgccaacgcg 7680

cacgtgatcc tcgagagcac gcccacttcg cccaaggcgg tcgaggcccc ggccggcgag 7740

gtggcgctca tgctgagcgg ccacgacgac gcggcgctgc gcgacgtggc ggcctcgtac 7800

gcgaacttcc tgcaggcgag cgagcaaccc gcgctgcacg acatcgcgta caccaccgtg 7860

cactgccgcg actggcacca gctccgcgcc atggtggtcg gcaccgaccg cgccagcatc 7920

gccggcgccc tcgcgcaatt cgccaagggc cacaacacgc ccggcgtcgt gagcggccgt 7980

gcgctgccca acgcgtccgg ccccgcattc gtgtattcgg gcaacggcgc gcaatggccg 8040

gcaatggggc gcgccctgct cgccggcgag ccggtgttcc gtgccgccgt gcaatcggtc 8100

gacgcgctgt tccgcccgct cggcggctac tccatcctcg agatgctgga gcaggaggtc 8160

cccgccgacc agctggaccg caccgagatc gcccagccgc tgctcttcgc cgtgcaggtg 8220

ggcttgaccg agctgctgcg ccactggggc atccagccca cgcgcgtcac cggccacagc 8280

gtgggcgagg tcgccgccgc ttgggcgagc ggcgcgctga cgctcgacga tgcggcgcgc 8340

gtgatctacc accgcagccg ctaccagggc gagaccaagg gcgcgggcgc gatgaccgcc 8400

gtcggcctcg ccgaggacgc cgcgcgccgc cagatcgagg cgctcgggtt gaagggcgag 8460
gtgaacgtcg cgtgcatcaa cagcacgcgc aacgtgacgc tcgccggcac gcgcgccggc 8520

atcgaggcgc tcgaggccga gctcacgcag cgcaaggtgt tccaccgccg cctcgacctc 8580 

gactacgcgt tccacagccc ggcgatggac ccggtgcgcg acgggctcgt gagcgcgctg 8640 

cgcggcctca cgccgcgcgc cacccgggtc gcgttccatt cggcggtcac cggctcgccc 8700 

gccgcgggca accagctcga cgccacctac tggtggagca acatccgcga gccggtgcgc 8760

ttccaggccg cgatccgcgg catcgccgag tcgggcgtca atgtcttcat cgagatcggc 8820 

ccgcacccga tcctcaagaa ctacatcaac gacggcctgc gcgccgcctc gatcgagggc 8880 

cgcgcgctgg tcacgctgca gcgctcggcg agcgatcgcg cggcgatccg cgccgccgcc 8940 
caggaagtcg tgatcaccgg ctgcccggtg gagaccgcga agatgctccc cgcgcagggc 9000 

cacttcgtcg agctgccgcc ctacccgtgg cagcgcgagc gccactggcg cgcgcccacc 9060 
tcgcaggcct acgacctcat ccagcacggc aagcagcacc cgctgctggg ctatcgcctg 9120 
cacgagaacg acttccagtg ggaaaaccac atcgacaccg cgctctatcc ggcgtacgcg 9180 
gaccatgtcg tcggcggcgc cgtggtgttc cccgcggccg gcttcgtcga gatggcgctc 9240 
gccgcctcgg cgatcgcgct gggcggcgac gcgcacgaga tcgagacgct cgagatccgc 9300 
agcgcgctgc tgctcgagga ctccacgtcg aagaccgtgc gcttcgcgct ggaacccgat 9360 
ggccgcttca caatccgcag ccgcgctcgc ctgagcgagg atccgtggca actccacgtg 9420 
gtcggcaagc tggtcggcac gcccaccgag cttccgcgcg cgccgcgcca gtcggcgccg 9480 
tcgcgcaagc ccgatgtgat cgcggcccag cactacgaag gcgccgcgaa ggccggcctc 9540 
gcctacggcc cggccttcca gtcggtctcc aaggtgtggc tcgacgccgc ggacccgggc 9600 
agcgcctacg cgcgcctgat cctccccaag ccgattcgcg gcgagctggg cgtgatgcac 9660 
ctgcaccccg cgtcgctgga cggctgcttc cagctgctgg tcgacctgct gcgcgccgag 9720 
gcctcccggc acgcgcagat cgccttcgtg ccgatcctgg tcgggcgcac gcgcctctac 9780 
ggccctgccg ccctcgtcac cgcctgccgc gtccgcctga ccgcgcgcag cccgcgctcg 9840 
ctggtcgcgg acttcgagct ctacggcccc aacggcgacg tggtggccgc gctcaccggc 9900 
gtgcgcttcc gcggcgtcca gctgaagcag ccgcagggcg cgcgcctgcg ctacctgtcg 9960 

taccgcgcca tcccgcgcct gaatggcggc gacacgccgg gcgccaagcc gctgcccctc 10020 

gcgttgctgg ccgcggcgtg cggcgagcgc tggcattcgc tctcggccgg ctcgcgcaag 10080 

cgctactacg atgaggtcga gccgctctcc gacgtgctgt gcagctcctt cgccgagcac 10140 

gcgctgcgaa agcggatcgg cgaggccccg ctcatcgacc cggccgcgct cgccgcgtcg 10200 

acgccggccg agcattggcc gctcctcaag cgcgcgatcc agatgctggt cgaggaccag 10260

ttgctcgagc ccgccgaagg cggctggcgc tggagcccgg tcaccgaatt gcccgacgcg 10320 

caggagagct ggatcacgct gctgcgcgac tacccggacg aggccggcca gctgatcgcg 10380 

ctcgggcgcg ccggcgctca cctcgcggaa gtgttcgccg gcaaggaagc gtccttgccg 10440 

agcctgttgc cggccacgca ggcgaacgtg ttcgcgcgcc tgtgctcggg cgcacccgcg 10500 

ttcgccaacg ccgggcgcgc catcgccgac acgctggcgc tcgccctcga gcgcctgccc 10560 

gcgcaccggc cgctgcgcgt gctcgaggtg acgccgtgcc gctccgagat cacgctgcgc 10620 

gtcctcgagg ccgtggacct cgaccgctgc gagtaccacg tcggctgcac cacggacgaa 10680 

gcactgggcg agtacgaagg cgtcctcgat gccattccgc acgtcgagac ttgcgtcgtg 10740 

gacctgaaga tcccgggcat gggcctgaag gcgccgaacc agggcccgtt cgatgtcgtg 10800 

atcgtttccg atggcctgct cggtgcggcc gatccggatg cggccgcggc gcacctcgcc 10860

ggcgcgctcg cggaagacgg gctgctggtc atggtcgcgc agcatccgtc gcgctggacc 10920 

gacctcctct tcggcgcgga cccggcgtgg tggacgctcg ggccctcggg ctcgcagcgc 10980 

tcgcgcctgc gttcgccgga cgagtggcgc atcgcgctgg cccacctcgg cttcgggccc 11040 

accgtcgcgg tgcccgagct gcccggcctc gagcgcggct cgtacgtcct catggcgcgg 11100 

cgcgacggcc cggtgccgca ggctgccgag cgcagcgttc ccgccaagtc gggctggatc 11160 

ttcgtgcagg aagcgggcgg ttattcggcg gcgttcggcg cctgcgtgca ggcggatctg 11220 

cgcgcacgcg gccacgcggt gttcgagatg gcgcccgatt ccgtcgaggg ggccgagatg 11280 

gcgcagcttg cgcgcgcgcg ccagacgctg ggcgacatca gcggcatcgc cttcctggcc 11340 

ggcctgcccg acgacgcggc cgccacgggc gacgcggccg agcgcctcgc gcgccagacc 11400 

gcgcgctgcg ccgcgctcgc gcgcctgctg cgcgaatgca aggcgagtgc catgacaccc 11460 

gcggtgtgga tcgtgacctc gggcgcggcc tccagcctgg cgcgcaacgt ccccgccctt 11520 

gcctcgcgcc gccgccctcc cgatccggac gaggcgatcg tatgggggtt cgcgcgcacc 11580 

gcgatgaacg aattccccga gctgcgcctg cgcctcgtgg actgcccgaa cccggagaac 11640 

ctcgagcgca acgccgcctc gctggtgcag gagctgctct cgcccaccgc cgacgacgag 11700 

gtggtcctca ccggcgccgg ccgctacgcg ctgcgcctgg gcgcggtgcc gccgcccgca 11760 

tccgtgcgca ccgctaccca gggcatcacg cgcctcgact tcgccgcgcc cggcccgctg 11820 

aagaacctcg cgtggcgcgg cgacgtgcgc cgcaagccgc gcgcgaacga ggtggaaatc 11880 
gaggtgcgcg ccgcgggcct caacttccgc gacgtgatgt acgcgatggg cctgctctcc 11940

gacgaggcgg tcgagggcgg cttcgccggc gcctcgctcg gcatggagct ctcgggcgtc 12000 

gtcaccgccg tcggcaagga cgtgacctcg gtcgcgcgcg gcgacgaggt gctcgcgttc 12060 

gcgccctcgg cgttcgccac gcacgtggtc accaccgccg attcggtcgc gaagaagccc 12120 

gccggctgga cgttcgagtc cgcggccacc atcccgaccg cgttcttcac cgtgtactac 12180 

tcgttgaagc acctggccca gctgcgcgag ggcgagcgcg tcctgatcca cggcgccgcg 12240 

ggcggcgtgg gcatcgccgc gatccaggtc gccaagtggc tgggcgcgga gatcttcgcc 12300 

accgccggct ccgacgagaa gcgcgacttc gtgcgcctgc tgggcgcgga ccacgtgctc 12360 

gactcgcgca cgctcacgtt cgccgacgac gtgctgcgca tcaccaacgg cgagggcgtc 12420 

gacgtggtgc tgaactcgct ctccggcgag gccattgcgc gcaacctgcg cgcgctgcgc 12480 

ccgttcggcc gcttcatcga gctcggcaag cgcgactact acgagaacac gcacatcggc 12540 

ctgcgcccgt tccgcaacaa catctcgtac ttcggcgtcg atgccgacca gctcatgaag 12600 

gagcggcccg acctcgcgcg ccagctcttc aacgagctga tgcagctctt cgagcagggc 12660 

gtgctctcgc cgctgcccta ccgcgccttc gccgccaccg aggcggtgga agccttccgc 12720 

tacatgcagc actcgcgcca gatcggcaag gtggtgctgt cgttcgccga cggcgtgaag 12780 

gcgcagcccg cccggcccgc cgtgccgcgc gagctggcgc tgtcgccgaa cgccacgtac 12840 

ctcgtcaccg gcggcctgtc gggcttcggc ctgcgcaccg cgcagtggct ggtggacaag 12900 

ggtgcccgcc acctggtgct ggtctccaag agcggcgccg agtcgctcga gaccaaggcc 12960 

gcggtggccg acctcgaggc ccagggcgcc acggtgatcg cggaggcgtg cgacatcacc 13020 

gaccgcgcct ccgtgcaacg gctgctcgcc gaggtcgcgg ccgcgctgcc gccgctgcgc 13080 

ggcgtgatcc acgcggcggc cgtgatccag gacggcttca tcaccaacat gacggccgcg 13140 

cagatccgcg acgtgctggc gcccaaggtc ctcggcgccc gccacctgga cgagctcacg 13200 

cgcggcgcca agctcgattt cttcgtgctg tactcgtcgg cgacgacgct gttcggcaac 13260

ccgggccagg ccaactacat cgcggccaac tgcttcctcg aggcactcgc caaggcgcgc 13320 

cgcgtcgagg gccagccggc gctgtgcgtg ggctggggcg cgatcgggga cgtgggctac 13380 

ctcgcgcgcc acgagaaggt gaaggaagcc ctgcagacgc acatgggcgg caccgcactc 13440 

gagtcggagg ccgcgctcgc ggtgctggag cagctcctgc tcgccgacgc ctcgggcctc 13500 

ggcgtgctcg acttcagctg gcgcaacctc cgccgcttcc tccccaccgc cgcctcgccg 13560 

cgcttccgcg agctgggcgc gcgcggcgac gacgcgaagg acgacgacga ccggcgcggc 13620 

gagctgcgcc gcctcgccaa cgagctggac gcctccgagc tctcggcgat gttcaccgac 13680 

ctgctgcgcc gcgaggtggg cgagatcctg cgcatcccgc cggaccgcct cgatacgcgc 13740 

cagcccttgc aggaaatggg catggattcg ctgatgggcg tggagctgct gaccgcggtg 13800 

gaggcgcgct tcggcgtgaa cctgccggtg atgtcgctct ccgagcagcc gtcgatcgag 13860

aagctggtcg accgcatcgt gcgcgcgctg aaggacccga acgccggcac cgaggccgaa 13920 

tcgcgcagcg accagatcga gcgcgtcgcc gcgcagtacg cgcccgagct cgactcgcgc 13980 

caggtcgagg agctcacgca ggcggtcgat gacgcgcaag ccacgcagca gggacgtccg 14040 

tgaaggcgcg caacctccgc ggcctctcgc aggcagtgaa ggagcagctg atccagcgcg 14100 

tgctcgagca gcgcgtgcgc cgggtcgaga aggacaagcc ctcggcggtc gagtccgcca 14160 

tcaacgcctt ccgccccgac cccgatgccc tgggccacat ctcggagcgc tactgccgct 14220 

tcgacatgca cccggtctac cagcagatgc agctggtgaa ggaaggtgcc gcgaagctgg 14280 

gcattcccga tccgtacttc cgcgcgcacg acggcgtggc cgcggatacc acgcgcattg 14340

ccgggcgcga gtacatcaac ttctccagct acaactacct cggcttctgc ggcgacccgg 14400 

ccgtggatgc cgcggccaag caggccatcg acgaatacgg gaccctcggt ctcggccagc 14460 

aggctcgtgt ccggcgagcg gccgctgcac cgcgagctgg agcgcgccat cgccgaggtc 14520 

tatggcgtgc acgacgccgt ggccttcatc agcggccacg cgaccaacgt ctccaccatc 14580 

ggccacctgt tcgggccgcg cgacctcatc gtgcacgacg cgttcgtgca caacagcatc 14640

ctgatgggca tccagctctc gggcgccaag cgcatggcct tcccgcacaa cgactggcgg 14700 

gccctggacg agctgctgcg cgcgcagcgc cgccacttcg agcgcgtgct gatcgtgatc 14760

gagggcgtct acagcatgga cggggactac cccgagctgc ccgaattctc gcgcctgcgc 14820 

cgccgccacc gcgcgttcct gatggtggat gaagcgcact ccttcggcgt gatgggcccg 14880 

cgcggcttcg gcatccgcga ccacttcggc atggccggcg acgaggccga catctggatg 14940

ggcacgctct ccaagacgct cgcctcgtgc ggcggcttca tcgccggcga gaaggcgctg 15000 

gtcgagcacc tgaagttcgc cgcccccggc ttcctctaca gcgtcggcat ggccccgccc 15060 

gtggcggccg cggcgctcgc cgcgctgaac cgcatgcgcg aggagccggc gcgcgtggcc 15120 

gccctgcagg cgcgcggacg cttcttcctc gaggccgcgc gcgccgcggg catcgacgtg 15180 

ggcctctcgc agggcatggc cgtggtgccc gccatcaccg gcagctcgat ccgcgccggg 15240 

cgcctggccg aggcgatgtt ccagcgcggc atcaacgtcc agccgatcgt ctacccggcc 15300 
gtgcaggaga actcggcgcg cctgcgcttc ttcgtgagcg ccacgcacac cgaggagcag 15360

ctgcgcttca cggtccgcga gctcgccgac gcctggcgca agctctgagt ggcgggcccc 15420 

cggctgcgcg tcgtcgtctg cacccacggc ggcctgaacg gcgcgctggt gctctcgcgg 15480 

ctgctggccg cccccacgct cgaagtctcc gcgctggtga tctccagccg cgcgcgcggt 15540 

gcccatgagt cgtatggccg tgccgcgctc ggatacgtgc gcgcgagcgg cgtcgcctat 15600 

gcgctgtacc tctggtgcgc cacggcgctc gccgacctgc tgctgcgcgg cacgtccgag 15660 

ggcccggtcg cgcgcatcgc cctcgcgcgc ggcatcccgc tcctcgccac gccgcgcgtg 15720 

aacgatgcca ccgcgcgcgc cttcatcgcg ggggcggcgc cggacctgat cgtctccgcg 15780 

ttcttcaacc agcacatcga cgccgacgtg gctgcgctcg cgcgcgtggc cgcggtcaac 15840 

atccacccgt cgccgctgcc gcacttccgc ggcgtggatc cggtgtcctt cgcgcgcctg 15900 

cgcggtgccg agcgccacgg cgtgagcgtg catcgcatcg aacccggctt cgataccggc 15960 

gcgctgctcg cccaggaaac cgacgtcgag gcccccggca gcgtctttgc cgccaccgcg 16020 

gcgctctatg accgcggggc ggcgctcctc gccggccgtg ccgcggcgct ggccgccgac 16080 

ccgcgcggaa ccccccagcc ggcggggggc tcctacgact cctggcccac ccgcgcccag 16140 

gtcgccgcct tccggcgcgc cggcgggcgc ctgctccgcg cccgcgacct gtggcgcctc 16200 

gcgcgccggg ggccggccgc tttcgtaata gaatcagcgc ggtagcggcg actcacgccg 16260 

cgcttttccc cgaccccccg aggcccatga aacactggct gaagcaacac gcaatcttcg 16320 

cgctcaccgt cctgctgccg accgtggccg cgatcctcta cttcggcctg atcgcctccg 16380 

acgtctacat ctccgaatca cgcttcgtgg tgaggagccc ccagcgccag gtgcagaccg 16440 

gcctggtggg cgccctgctc tcgggcaccg gcttctcgcg ctcccaggac gacacctact 16500 

cggtgcacga cttcatcacc tcgcgcgacg cgctgggcga gctggacaag aagctcgccg 16560 

tccgcaagct ctacacggcc gccaacatcg acttcatcaa ccgcttcccg gggctcgact 16620 

gggacgacag cttcgaggcg ttccaccgct actaccagaa gaaggtcacg atcgacttcg 16680 

acaccgcgtc ctcgatcacg gtgctgcgcg tgcgcgcctt cgagaaggcc gactcgcggc 16740 

gcatcaacga cctgctgctg cagatgggcg agcgcctggt gaacgagctg aacgagcgca 16800 

gccgccagga cctgatccgc ttcgcgcagg ccgaggtctc gctcgccgag gacaaggtga 16860 

aggatgccgc gctggcgctc tccgccttcc gcagcaacca gtcggtgttc gagcccgacc 16920 

gccaggcctc gatccagctg cagggcgtgg ccaagctcca ggaggagctc atcgccaccg 16980 

aggggcagct ggcacagctg cgcaagctct cgcccgacaa cccgcagatc ggcgcgctcg 17040 

agaacaagtc ggcggcgctg cgcgtggcga tggcgcgcga gtccgccaag gtgaccggcg 17100 

gcagcggctc gttcagcgcg cgcgccccgg cgttcgagcg cctcaccctc gagaagggct 17160 

tcgccgaccg ccagctgggc gttgccctca ccgcgctgga gaccgcgcgc agcgaggcgc 17220 

agaggaagca gctctacctc gagcgcatcg tgcagcccaa cctgcccgac gaatcggtcg 17280 

agccgcgccg catccgctcg atcttcaccg tgttcgtgct gggcctggtg gcctggggcg 17340 

tggtgagcct gctggtggcg agcgtgcgcg agcacgtcga ctaggccgtg gagcagaacc 17400 

cttccctcct gcgcgccctc ggggtgcagc ggcgcgtgct ctacgcgctg ctgatgcgcg 17460 

aggtcatcac gcgcttcggg cgcgacgacc tcggcgtgct gtggctggtg gtcgagccga 17520 

tgatcttcac gctcggcgtg acggccctgt ggaccgcggc cggcatgaac cacggctcgt 17580 

cgctgccgat cgtggccttc gcggtcaccg gatactccgc ggtgctggta tggcgcaact 17640

gcgcctcgcg cgcctcgatg gccatcgagg ccaacaaggg cctgctcttc caccgcaacg 17700 

tgcgcgtgat cgacgtgttc gtgacgcgca tcatcctcga gatcgcgggc gccacggcgt 17760

cgttcaccgt gctgggcatc ttcttcatct ggatcggctg gatgccgatg ccgatcgaca 17820

tgctcaaggt cgcgttcggc tggttcatgc tgtcgtggtt cggcgccagc ctggcgctcg 17880 

ccatcggcgc cggcaccgcg tactcgctgg tggtcgagcg gctgtggccg ccgacggcct 17940

acctgctgtt cccgctctcg ggcgccgcct tcatggtcga gtggctgccg gagaaattcc 18000 

agcagttcgt gctcttgctg ccgatggtgc acggcaccga gctgctgcgc gacggctact 18060 

tcggcaacgc ggtgcgcacg cactacgacg tgggctacat ggcaacggtc tcgctgggcc 18120 

tgtcgctgct cggcatgcac ctggtgcgca tggcggccaa gcgcgtggag gccgcgtgat 18180 

ccgggtcgag aacgtgacca gatgtactca cgcgcctcgg cccagaacgt gctgacgcgt 18240 

cagcttcgag ctcggccgcg ggcgcaacat cggcatcctc ggcaagaacg gcgccggcaa 18300 

gtccacgctg atccgcctga tcagcggcgc cgagagcccc accaccgggc gcatcgtgcg 18360 

cgagatgagc gtgtcctggc cgctcgcctt cggcggcgcg ttccagaccc acctcaccgg 18420 

gctggacaac ctgaagttcg tctgccgcat ctacggcgtc gactaccgcg acaaggtcgc 18480 

cttcgtcgag gacttcaccg agctcggcgt gtatttccgc gagccggtgt tccactactc 18540 

gcacggcatg atcacgcgcc tcgccttcgc gctgtcgatg gcggtggagt tcgactgctt 18600 

cctgatcgac gaggcgatgg tggtgggcga cgcgcgcttc cacgagcgct gccacgtcga 18660 

gctcttccac aagcgcaagg accgcgcctt catcctcgtc acccacgacg ccaaggtgat 18720 
caagctctat tgcgagagcg cctgcgtgct gcacgagggg cggctgctgc ccttccccac 18780

cgtggacgcc gcctacgagt tctacatgaa cgaggtcatg caggacctcg ccccggaggt 18840 

tgcctgaacg cccccgcgcg gccgctgcgc gtgctcctcg agctgcggcc ggcgttcgac 18900 

ggccacgccg gcatcccgca ggaggcacgc ctgctgttcc gcggcctgcg catgatcgag 18960 

ggcatcgacg tggagggcct gctgcagcac agcggccacg tgctcgccaa gggcctgccg 19020 

ccgcgcggcg gcggcgacct cgcaccggac cgccagctga accgcctgtc gcgcgtcgtg 19080 

gtctccacca agcaggagct caccaacgcg cacgtggcga cggcgctgat ggccattcgc 19140 

cagctgctgg gcgcgcgcga gaccctctac cgcttcgacg cggcgcactt ccgcgacttc 19200 

gtctggcagg cgctcttcgc gcgcacgctg cacgccgcgg acttcgactc ggtcacccgc 19260 

gccggctttc gcgtcgcgcg cgtgccgtgg acgggcatgc accgctgcgc gctggtcacg 19320 

cgcaagctcg gctacacgct gcatccgcgc atcgacactt ccgatttcga cgtgctgatc 19380 

gccgagacgc cctacccggc gcgggtcgcg gacggcacgc gcctcgtggt gcgctaccac 19440 

gacgcgatcc cgctcctcat gccgcacacc atctccgaca tgtcctacca ccaggcctcg 19500 

cactaccggg cgctgcgccg caacgtggcc tcgggcgcgc acttcgcgtg cgtctccgaa 19560 

gccacgcgca aggacctgct ctcggtcttt cccgaggtcg aggcgcgctc cagcaccatc 19620 

cacaacatgg tctcgcacca ctacttccgc gagaccgccg acgccgggcg catcgaggag 19680 

atcctgcgaa cgcgcgcgag cgagcgcatc aagggcgcgc ccaacggagc cgccaccgcc 19740 

gtcgcacgat cgccctcctt cctcgcgcgc gccgcgcaac ccagcgaccc cggctacctg 19800 

ctcgccgtgg cgacgatcga gccgcgcaag aaccacgcgg gactgatcgc cgcgtgggag 19860 

cacctgcgca cctcgcgctt cccgggcctg cgcctggtgg tggtcggcat gcccggctgg 19920 

ggctacgagg cgctggtagc caagttcaag ccgtggctcg cgcgcggcca gctcttcgtg 19980 

ctcgaggacg tgcccgcccc cgagctgcgc ctgctctact cgcacgcgcg tgccacggtg 20040 

tgcccgagct tcggcgaggg cttcgatttc tccggcgtgg aggcgatgcg ctgcggcagc 20100 

ccggtgatcg cctcggagat cgccgcgcac cgcgaggtgt accgcgacgc cgccgagtac 20160 

tgcagcccct attccgtggc cgacctcgcc gaggcgatcg ggcgcgtcat cgacccggcg 20220 

gcgacgggcc tgcgccaggc gctggtcacg cgcggcaccg aggtctcgca gcggtacacg 20280 

cccgaagcca tcctgccgca gtggcgcgag tacctgctcg gcacggtgcg cgcggaggcc 20340 

ccgtgagcga tccgaaggcc accgaccagg cgtgggagga gtggggccag cgcgacccct 20400

acttcggcgt gatcaccaac ccgcgcttcc gccgcgggca gatggacgag gaagcgaagc 20460

gcgaattcct cgcgtccggc cgcgtccatg ccgactacgt gatgcgcatg gtgcacgcgc 20520 

acatcgcgcc cgacttccgt ccgcgcacca tcctcgattt cggctgcggc gtgggccgcc 20580 

tcgtcatccc cttcgccgcc caggccgagc aggtgaccgg cgccgacgtc tccccctcga 20640 

tgctggccga ggctgcgcgc aactgcgccg agcagggcgt ggccaatgcg cgcctcgtgg 20700 

tctcggacga ctcgctcacc gggctgccgg gccccttcga cctcgtgcac tcgttcatcg 20760 

tgttccagca catcgatccg gcgcgcgggc gcgacatctt ccgccgcctg ctgggcaccc 20820 

tggcgagcgg cggcgtgggc gcgttgcact tcgtctatgc gaagcgtatc tacgcggcga 20880 

cgtacggcgt ggcgccaccg cccgagccgc cgccgccccc gccgccgccg cagcctccac 20940 

ccagccgggc ggagatcaag gccgcgagca aggcgcgggc cgcgctggcc acgcaggcac 21000 

gggcgccgga accggcaccc aatcccgacc ccgacatgca gatgaattcc tacccggcgg 21060 

cggagatgct gttcctcgtg caggaagccg gcgtcacgcg cttccacgtc gagttcaccg 21120 

accacggcgg cgagctgggt ctcttcctct tcttccgcaa gccctaggcg tccttcgagg 21180 

cgaaggagag gcccagcgcg gtgatcgtcg cggcgtgctg cggggcgacc gccgccgcgg 21240 

ccgcgacttc ggcagccacg cgctcgaggt cgccgcgcga cccgtacagg cgcacgcgcc 21300 

agtagtgcac ccagtaggag tccggacagg ccgacgcgat tcccagcgcg cgctcgacct 21360 

cgggcccgac gcgcccctgg tgcacttggt gcgaggccca ttgcaggttc acggagggct 21420 

cgaggggcag gacggcgagc agcgcctcgc tcaccgccag cgcgcgcgcc tcatcgccgt 21480 

cgtcggcgag gcgcatcagc gtgggcaggg actccggcgc aagcgcgtgc gcggccgcga 21540 

gcgaatcgcg cagcatcgca gcgtcgccgc gcgcgagcgc caggcgggcg cggtggtagt 21600 

gggtccagaa cgaaggcgcc gcggcatcga gcgcgcgcgc gacctcggcg cttgcgtcgc 21660 

ccccggagag ccgggacgcg gcccacaaca ggttcgcctc cgccgcgtgc gccggcacgc 21720 

tccgcaacgg cgccagcacc gcggtggcga cctcgcgcga accctcgcgc gacgagaagg 21780 

ccgagcgcgc gaactgcatc agcgattcgc cgagcgacga atgttcggcg cagagccgcg 21840 

cgggatccac cgcgccgagc gcgcgggcgg cacgcagcgc cgcctcctcg tcaccgcgaa 21900 

gcgcgtgcag ccgcaggcga tggaacgcgg cccactccgc cgacatcgac gccgccccgc 21960 

gctccagcgc accctcgacg gcgccatcgg tgcgcccgcg cgcgatgtgc aggatggccc 22020 

aggtcgcctg cgcgccgggg aggttcgtgc cttgcgcgac gagcgggtcg aggaccgcga 22080 

gcgccgcggc ggcgtcgcgc gcgccactga agtcgttggc caggccgatc agcgccccgg 22140 
^ raca ccccaccgcc gcgaggtcgc
ttccagcccg aggccgtggt gcccgcgtcg ctgaacacca ccgtgccgtc gatcaggaac 25620

tcgaacttgc cggcgttggg gaagctcgac acgcgatagg cgaacgcgac gttgcccgcg 25680 

agcagcgtgc ccgcgaacga gaggtcggag ttcaccgtcg tcgagttggt cgggtcgctc 25740 

gtcaccacct gcgccgaacg aaggctggtc gcgccctcga aggcctggtc ggagccgacc 25800 

gtccatgcgt tggcgccgcc cgaggtcgtg aagccggcag gcagcgtgcc acccgtgggc 25860 

cacggcgaga ggatgatgcc ggtcgcggtc gccgggccgc cgatgaagac acccgtggaa 25920 

cccgtcgcat tggagagcgc gacgttgaac gtctcgttgc cttcaccgat accgtcgctc 25980 

gccaccggga ccgtgatggt cttgggcgtg gtctcgccgt tggcccagct gagcgagccc 26040 

gaggtcgcgg tgtagtcggc gcccgaagtg gcggtgccgt tggaggtggc gtagctgacc 26100

gagatggcac ccgccgagcc gccgatgcgg ctcacggtga gcgtcacgtt gccggcggtt 26160

tccgcggccg cgaaggtggt gccggtgaac tgcaccgaac cgggaacggc gggcggggtg 26220 

accgtggact tgagcgccga gagtgccgcg cggttgttgt tcagcgagag cgcgtcgttg 26280

gcctgcgcgg tgccgcacgg ttgcggcgca ccgacacccg tcgtgcacga gaggcccggg 26340

ttggagaagc ggtagacctt ggtcgtcgag tcatgcagca gcgacatgat ggtggtgaag 26400 

ttcgccgcgt cgttcgtcgt gcactcgggg cgctggctca cgccgcaggc gccgccgccg 26460

gagccgttgg gcagggccga gttgcaggtg aggttgccgc tcgcgcaata gaagtggccg 26520 

tacgagtacg cgaacgcacc ttgcggcggg ctcgcggtgc cgcccgcttc ccaggcctgc 26580

gaggggcggt cgtggcggtt gcccatggca tggccgagct cgtggatgaa gacccactcg 26640

caaccgaaca ggcagccgga agcaacggcg taggcaaagt tggggtccgg cgtgctggtg 26700 

ccgatccagg ctgcgccgtt gccgccgaag tcggcgccgt tgcgcatgaa ggccaccatg 26760

tccgcgccgt attgcgtgcg gatggcctcg atgttgccga acgtcgcggc gtcgaagctc 26820

gcgtcgcccg gggtgatggc atgcatcgcc gtggtgtcgt cgatggcgtc gtcgtagttc 26880

acctgcgtcg cattcaccag gcgcagcgtg atggcgacct cgctgtcggc atacgcagtg 26940 

ttggcgcgcg tgacgaggaa gttcaggcgc gtcatcaggc tgccgccgag gcgctggccg 27000 

aagccgcgcg agtacacgat catcaggtcg atgacgttct gcggcgtggg cgtcacctgg 27060 

gcgagccgac ggtattcacg ccggggatgg cgatcagggt ctcggggctg ccggtcgcga 27120 

cgacgcgcga cttcgagccg gtcgtggcca cggcgtggtc ggcgccgagg ctgatgatcg 27180 

gcatgctggc ctgctcggcc gtcatgtcga ccaggtagtc gacggtgccg cccggaatca 27240 

ggcggtactc accctgcggg gtcgagaaca cgccgaagga gccttcaggc ccggtcgtga 27300 

cgatcgcgcg gtggttgatg ccgtcggtct tgctcttcgc gatccacgac gtgatgccgt 27360 

cgccgtgggc ctggagcaac tcgaagacat acgagtggcg caccccgttg gggagcgaca 27420 

gctcgacctc ggagcgcgga ctcagggcat ggagggccgc ggcgttgaag cgcaccgctt 27480 

gccgggcaac ggcgctgccc ggcgccggca tcgcggcggc gggtgcggag accaggatct 27540 

ggggaaccgc ggcgaaggcc acggaagaga agacgatcgc cagcgcgccg aggaaggcgg 27600

ctgtgatccg ttgcgtgaat gcgttcatgt ggagctccgg aagttgaccc atgcccaatc 27660

cgctatgtcg cggagatgtg gacaaaaggt atcgagcggg cgtgacgacc cgcccccgga 27720 

gggatgctcc aaaaggacta cgagggtgct acggctgggt tggtggcgaa ggccgaagga 27780 

gaactccttg tggtctccgc gcgacttaag gttgcgaggg aaccaccgac cacccaccgg 27840 

ggccgatcaa gtccgcgctg ccgatcgtgg tgacgccgtt catcaggcga atcgtcacgc 27900

gcccatcgtt gtggtggaag accacgtcca tgcgcccgtc gccgttgaag tcgaggagct 27960 

gcgtcaccgt ccagcccgcg aacggcggca ggatgtcggc ggcgccgagg atcgcggtgc 28020 

cgttcatcag gcgcacgtgc gcgcggccgt cggtgtgcgc gaacacgagg tccgcccggc 28080 

cgtcgccgtt gagatctccc accaggttca ccgaccatcc ggtgcccgcg ggcaacagct 28140 

cggcgctggc gccgaacgtc gtgccatcca tcaggaacag gtgtgcgcgg ccgtcgttgt 28200 

ggcggaagac gaggtcggcc ttctggtcgc cgctgaagtc cgcgacgtgg ctcaccgccc 28260

agccgctgcc gggcgcgagg aagcccacgc cggcggtgat cgccgtgccg ttcatgatgt 28320 

atgcgtagcc gcgcccatcg acgttggaga agacgatgtc ggccttgccg tcgccgttca 28380 

tgtcgccggt gccgaccacg ttccatcccg agcccgcggg cagcaggctc gcactgccgc 28440 

tgatcgcggt gccattcatc agccagatgt gggcgcgtcc gtccgtgtgc tgagcagcag 28500 

gtcggccttg ccgtcgccgt tcaggtccgc cgtgtggctg atggtccagc ccgtgccggc 28560

ggggaacagc tccttgccgc ccacgaccgt gagcccgttc atctggtacg tgtagatgcg 28620

cccgtcggtg tgctggaagt agatgtcggc ctggctgtcg ccgttggaat ccgcggccat 28680 

tgatcacgct ccagccggcg ccggcgggaa tcaggttcgc cgaaccagtg atcgtcgttc 28740 

cgttcatggt ccacgcggcg atgcgcccat cggtgtgctg gaacaggaag tcgctctgcc 28800 

cgtcgccgtt caggtcggtc gccgcatggg tcgggaacgc caccttcgcg aggaaggcat 28860 

cggagaaacc cgtgagcaac gtcttgtacg cgccgggcgt ggtcggatag ttcgccgacg 28920 

acgtgcggcc ggccacatag gccgcgcccc gcgcatcgac gccgacgccg taccccacct 28980 
cgacgttgtt gccgccgatc agcgtcgaga acacgatgtc gccgttgcgc acgcgagtga 29040

ggaacgcgtc ggcatcgccg ccggggccct gcaccgggtt cagggtcggg aagccggcga 29100 

cctgcgtgta gcccacgacg agcgcatcgc cattcggcgc cagggcgacg tccaggggat 29160 

tgtcgtaggc cgagccgccg aagatgcccg acgattcgag cgcgccggtg gccgagaagc 29220 

gcgtcacgaa cgcatcgacg gtgccggcga aattgcgcac cgttccggtc tgcgggaagc 29280 

tgggcaggcc ggtatcgccg acgacggtcg cctggcccgt gaccgggttc accgcgatcg 29340 

cggtcacgtt gtcgccgctg ccgccgagga acgtggaata cgcgaccggg ccgccgctcg 29400 

caccgatctt ggtgacgaag ccgccaccgg gttgggtctg gtacgcattc acgttgcgga 29460 

acgtcgagga gcgcgtggag cccgcgacgt acgtgttgcc cgcattgtcc gttgcgatac 29520 

cgcggccgat gtcatcaccg ctgccgccga gatacgtgga atagacgaag ccgccgttga 29580 

acgagagctg gtgacgaacg cgtcgcgatt gaggccggtg ccgacgaaca cggctgcagc 29640 

gcgctgccgc cggtcgggaa gccgcgcgcg gccgaacccg tgacatgcat gaagccgttg 29700 

ttgtccacgg cgatcgccat gccctcgtcg tcgccggagc cgcccagcag ccgcgagaag 29760

atgacgttgc cagtggtatt cagcgccgtg acgaatgcat cgcgcccgcc ggcatggttg 29820

gtgcaatcgg tgacggcgaa gcaatgcggg aacgcgttgg tgccggtgga accggtgatc 29880 

gcgacgaact cgttgccgcc ggaggcgccg agcgccagcc cgcggagata cgcttcggcc 29940 

ccgagcgtgg tcgagtacac cacctgctgc gtggtggggt tgaacttcac caccgtggct 30000 

tcgtacgtgc ctgcggtata ggtcgcgagc ccgacgtaca cgttgccctg cgcgtccacc 30060 

ttgaccggcg tggtccactg ctcgtcgccg gcgccgccca ggtacgtgga gaacgccatc 30120 

accgggtcga tcaccagcgt gcgggtgctg tcgtagtcgg cgatcacgaa gccggcctcg 30180 

gcgccctgct cgcccacgaa gagctcgaag cgcgcggcca ccggcacgcg cgaatcgccg 30240 

acctgctgga aggctaccgg cgcgtgctgc gtaaattcct cctggcccac gcgcatgcgc 30300 

aggttgccct cgccatcgat ccacgcgtgg tcggcggacg agaggtcgag gcggatctgg 30360 

cgcggatcgg cgcgcggggc gaccacgaag tcgtattcga gcgtgccctc cttgccgtac 30420 

acggtgaggt ccacgccgcg atacaggtcc ttcagcgaca cgcggccgaa gtgcggcacg 30480 

ttttcgcgcc ggccggcggc cgtggtgccg ctgtaatagt ggctgagggt ggcccggggc 30540 

tcctcggcct cgatgaccgg atcggaagcc cccgcgaagc gcacgcgcaa cagcgaggcg 30600 

ctggcagctt tcctggagcg tggctcgaag gcaatgccat cgcggctcac ggagacgcgt 30660 

ccggcctgcc cgcgcgagac gtagagcgca tccgggccga actggccctc gttgcgctcg 30720 

aacgtgatgg gcgcgttcct gagggcatca gggacgacgg catggacggc ggcgggagtg 30780 

gcgagggcgg ccgcgacgag cacggccagg gggcggagcg ccgggcgggc gccgctggag 30840 

ctgcgggtca tggggcagtg acctttctgt tgttatggct gcttgttgtc tttgcgggat 30900 

ccacagctcg gatccagctg gcgatactac cagcgagcga gtggtttttc gtgacccagt 30960 

cccgcagtcg ggccccggcg acaacggcgg catcgcggtc ctttgccagg gcacgcacgg 31020 

cctcgatcca gtcgccgtcg cgggccagca cggcgggggc atcgcggtag ggctcgagat 31080 

ccgaggccac caccgggatg ccgaggcagc cgtactcgag cagcttcagg ttgctcttcg 31140 

cgcgattgaa cgggttgtcg cgcagcggtg ccaccgcgac gtcgagtgcc agcgacgcga 31200 

gcttctccgg gtactgcgcg atgggcacca tgtcgtgcac ttccgcggcg aacggcgcca 31260 

gctcgggcgt gcacaggccg aggaacaccc agtcgatctc gcgatgcgtg gcgcgcacca 31320 

cgggctcgag caacttcagg tcttccccat gctgtttcgc ccccgcccag cccacgcgcg 31380 

gacgcgcgcc acccgcggga cggttcgcga ggccggccca acgctccgca tcgatcgcgt 31440 

tcgggatcac gcgcacgtcc ttcgcgccgc ggccgaaggc ctccgcgagc ggtgcggtgg 31500 

acaccacgag gcggtcgcac aacgccaccg cgcgcgcgat gcgctgggcg atgtccgggt 31560 

agatcgtcgc cgcatacgga ttgccgggcg gcagttgcgt gagcaggtca tccaggccca 31620 

ggaccttcag cgcgcgtccg tggcgcgcga gcacctcgag cgaggtgagc tggtagttgt 31680 

ggaagaagtt gtgcgcgagc acggcgtcgg cgtcgaggcg ctgccactcg acgcggttgg 31740 

gcgcgcagcc cgtggcgtgc tcgcccatca tcacgacctc gacgcggccc gcgcgctcca 31800 

gcgccgcgca cggctggcgc acgcgcacct cgccggagcc ccagcggtcg aacggaaacg 31860

cgcagagccg aaccgcgtgc gtgtcgttgc cgcgcggcgc gaagcgttcc acgggcccgt 31920 

agccctcgcc gcccatgcgc aacgccgggt ggtagtgcgg atcgtcgtcg agcaccggct 31980 

gccagcgctc gcgcatccac gcctcttcgg aaggaatgac gagcttcgcc accggaacga 32040 

cggcatcggc gcgcgcatcg gcgagcaacg ggaagtccgc cgcgatgccg gggcgcgcaa 32100 

ggatgtcgaa gcccgctgcg cgcagcccga ggcacaggtg cgccatcgcg aacggccccg 32160 

cgcgctcgat ctcgtgcagc gcctgcgcgg agagcgccgc gtcgcggttg accagcgcga 32220 

gacggcccgc cgcgacgctc acctcgcgcg gcgatgcgta gagtgccgcg aggcgctcct 32280 

cgccctgcgc gcgcggcgga gcgcccgaga cggcccacgg cccgccgccc gcgatttccc 32340 

agccgggaac gcgcacaccg ccggggccca gcaggtcggg cgcgagggca cccgtggagg 32400

ggcctgcgat gccctgctcg aggcgctcga gccagccggc gcggaattcg ctgcaacgcg 32460

catcgacgat cgcgatccac agcgtcgccg cgttgcggat ggcgcgcgcg aacgcggagc 32520

cggtttcgcc gcccaccgga accaggtgcg tgcgcacgcc gcgcccgccc agctcaccgg 32580

gccagcgttt cggcgccctt cgcgtcgaga tcgacgtaca cgtggccgat acgttcgcgg 32640

cagcccttca cgagcgcctc gatgcacgcc accagccgct ccggcgcgtg cgtcgcgtca 32700

atgaaggcgc aaatcgtgcc ggccccgatc gcgcgcgccc acgatgcggg cgcatcggac 32760

gcagctggca ggaacgcggc ctgcgtgcgc gtgcggcgga agtgctccgc gaccacgcgc 32820

tggtggcgct cgcgcgcggc gtcgctttcc tgcgcggcga tgttcaggtc gaggcgatgc 32880

acgagcagct cgcgcagtgc caccacggcg ctcgcacggc cctcctccgc gaggcgcagc 32940

gcgagatcgg tgacgcccgc ccagccgccg gcgcgcagcc cgccacgctc cagcaacgca 33000

gcgcggcgca gcaccgcgag cccggcgagc ggctggtgtc gcgcgaactc cgcgccgaac 33060

gtggggacga acgccggatc gtagcgttcg ccgtgcgcgc tcagggcatc actgcagccg 33120

tagagaagct cggcgttcgg gttgcgtgcg accgcacccg ccaccgcgct cggcgcgcca 33180

tcggccagcc ggtcgccctc ctccaccagc acgacccatt cgcccttcgc gtccgcgatc 33240

gcgcggttga cgcgcccgag cgccgcgtcg ctcgcgtcgc gctcgaggac cacggtctcc 33300

agggcgatcg cggcagtgct cctggccggc aaggcctcgc gcgccatcca gtggtcgtag 33360

gcatcgccgc ccagcggcgt gcgctcgcgg cggcgtggaa agccgaggca cgccgcggcg 33420

aacaacggat cgatcgttgc gggcacagca tccaccgcgc catcgcgctc cgccgccagc 33480

tcgcgataga gcgccaggta ctcggccgca ttcgcctcga tcgtcttcgg gcggtggtac 33540

ttcaggttct ccgcgacgcg atcgagcgcc gtgcggtcgt cgcgcaggcg cagcaggagg 33600

gccacgatcg cgccgggatc gttgtgcggc agcaggaagc cggtctcgcc atcgcgcacg 33660

cgctcgggaa tggcgccatc gcgcgtgccc gccaccggga tgccgaggcg ctgtgcctcg 33720

ctcaacacca ggcagaaggt ctcctccatc gtcgagggca gcaggaccag gtcgatgccg 33780

cgaaggatgc gcggcaggtc cgtcgcggca taggcgccgc gctgcatgat gcccgccgca 33840

ttcgccgccg cttcgagcgc gagatcgcgc ggaccccagg cctcgatcgc gatgcgctgc 33900

cccgccagcc ggcgcgcggc ctcgatgatc aggtgcgcgc ccttgcgccc gccgaagcgg 33960

ccgaggaagg ccacgcgcag caccggccgc gcttcggggc gcggaagcag cggcaggtcg 34020

gcaatgccgt gcccgatgac gcgcagcttc gcttcgcagc tcgcgccgaa gcccttgcgc 34080

atgcgcatcg ccgcgtactc ggacgggcag aggatcacgt cggccgcgtc cacggcatcg 34140

gacgccgcgg cgaagcgcga ttcgaggtag ccggtcacgg acgccggcgc cgtcggcccc 34200

acgatcgcct cgctgcggcc gcgcaggcaa tccacgcatc cgccatcggc gccgcgcgcg 34260

gcggcgcggc cgcacggcag gtcctgcggc ccggtcatca tgttgtagtc ggcgcacagg 34320

tacgagaagt cgtgcgccgt gatgaccacg cgtgcccccg actcacgcgc gatgcgcggc 34380

aggcgcaggc tgttccagcc caccaacgac tggaagtgca cgacgtcgta gcccccggac 34440

agcagttcgc ggaaggccgc ctccacgccc gcgtcggcaa ccgccgcctc gtgcccgatc 34500

acgcggatgc cgggcgtcat gcgctcgggg ttcaggcgcg cgagtcgcac gtgcgtcgcc 34560

aggcgctcgg tggtgaagtc ggtccacgcg cgcgggccct cgggcagcac cagcgtcgag 34620

gcgacgtcct tgcgcagcag gttcaccagc gcgcgcgtgt gctcctcaat gccgccgcgc 34680

gagtcgaggc gatgcagcac ctgcagcacg cgcgggcggc ccggcatzccg ctcgcgctcg 34740

gccgcggcgt tgacgcgctc ggcggcgcgc cgcagcgggt tcgcgatcgc ccacgcgcgc 34800

acgcccgcga cgtaggtggg ccagcggcgc tcgaggcgca cgcggttggc ctcgcgcgcg 34860

gcgtcgatgc cggccacccc gccgaacgag gcctcgcccg cgtggtgcac gtagacgtcg 34920

tcgcagcacg cgatctgcca gtcccggtcc agcgcgcgca tcgagaggtc gttttcctcg 34980

ccgtagccgc ggccatagag cgagtcgaac gggccgcatt gcacccacac ctcgcggcgc 35040

accagcatgc agaagccgac cgccgtcggc aggcgcggat actcgcggcg gctggcgcgc 35100

gcgacgcagg cggagatggc atcgacgtcc ggctcgccgt cgggccgctc gaagatatcg 35160

gcgacgccgg gcaggctgag gatcgtcgcg cggttggaga gcgggcacac gatgccgacc 35220

cgcgcatcgc tgtcgcggca acgcagcagg ccctcgagcc aaccggggcc gacctgcgta 35280

tccgaattga gcaccacgaa gtcgccgccg ctgccggcca ggcccgcagc gaccgcgcct 35340

gcgaagccga ggttcttcgc attggtgacg acacgagtgc cggcacgacg cgccgcgaag 35400

ctgcccagca acccggcgag gcgcggatcg gtgctcgcgt catcgacgag gacgatgccg 35460

tgcgcggcgc ccgtgtggcg ggccagcgac gcgaggcagc gttcggtgtc gtcgtggccg 35520

ttgaagaccg ggacgacgac tagcgccggc gccatccgac cggcgcttcg agctgggcca 35580

atcgcgccag caggcgctgg tgctcctgcg agagtgcctc gagctgcttc tgcgaggcga 35640

cgtaggcgcc tccatgcgct cccaggtgcg tgcgcgcctc ctcgagctgc cgcagcgact 35700

gctcgtggcg accggagacg tcgtggtcgc ggtcggagag cagcgagaag cgcccggccc 35760

ccgccgtcac cgcctcgcgc tcgccgcaca gggcgaggaa gtacatcgca tcggcgacgc 35820



WO 2004/013327 



PCT/EP2003/007765



34/39 

ccgccgcggg tttcgcagcc tccggcgccg aagcttgcag caactgcacg gcgtcgggca 35880 

ccgcgtgcag cggccagatc gccgaatacg catcgacgcg ttgcccgaac agggccacct 35940 

ccgggaaccg cgccttcagc acggcgagga attcgtgctc gtagagctcg cgcacgtgat 36000 

gcgggttgcg gtagtcgcgc tggtccgagt acacctcgcg gttcggcgtc gagaccagca 36060 

gcaggccgcc gggcgccagc acgcgcttcg cttcgtcgag caggcgctcg ggatcgggaa 36120 

tgtgctcgag cgtctcgaag gagacgagga gatcgatgct tgcgtcatcg cacggcagcg 36180 

actcgcagcg gccctcgacg tactgcaggt tctgcgcggc accatagcgg cggcgcgcct 36240 

gcgcgatggt ctcggcggcg acatccgcgc ccaccacgga cttcgcgcgc gtcgccagca 36300 

gcgccgagcc gtagccctcc ccgcaggcga tgtcgaggac gcggcagcct ccggcgagcg 36360 

gcaacgcaaa gtggtagcgg tgccagtgct cgtaccagat ctcgccctcg aagccgggct 36420 

ggaaacgttc gtgttccatc gggagagtat aggtggatgt gcaaccgccc tccggagagg 36480 

gcggttgcga tgatcttatt cgagatcggc tgctacggaa gcaccggcaa cggctgcggc 36540 

ggcgccaccg accatccggt gcccgcgccc agcaggttcg cgctgccgag cgtggtgagg 36600 

ccgtccatca ggcgcaccgt gatgcggcca tcgacattct tgaacaccat gtcgagcttg 36660 

ccgtcgccat cgaagtcgag cagttgcgtg acggtccagc ccgcccccgc cggcagcacg 36720 

tccgccgcgc tgaggatggc ggtgccgttc atcaggcgca cgtgcgcacg gccgtcggtg 36780 

tggcggaaca cgatgtcacc gcgcccgtcg cggttcatgt cgcccacgtg gctcacgatc 36840 

cagccggtgc ccgcgccgag gagctcggaa ccggcgccga acgcggtgcc gttcatgatg 36900 

aagagatgcg cgcggccatc ggtgtggcgg aagaacatgt ccgccttgcc gtcgccgctc 36960 

acgtcgccca ggtgcgtcac cgtccagccg ctggcgggcg agaggaagcc cgcgccggcg 37020 

gtgatggtgg tgccgttcat caggtagatg tagccgcggc cgtcagcgtg gatgaagacg 37080 

atgtcgtccc tgccgtcgcc gttgaggtcc ccggtggcca cgactctcca gcccgtcgcc 37140 

ggccccagca gctgctggct gccgatgatg gccgtgccat ccatcagcca caggtgcgag 37200 

cggccgtcgg cgttgcgcag caggagatcc gccttgccgt cgccgttcat gtcggcggtg 37260 

cggtcgatgc tccagccgag gccggcgccc agcagctcct tgccgccggt caccgtgagg 37320 

ccgttcatcg tgtacacgta cacgcgcccg tcggtgtggt gggatcctct agagtcgacc 37380 

tgcaggcatg caagcttgag tattctatag tctcacctaa atagcttggc gtaatcatgg 37440 

tcatagctgt ttcctgtgtg aaattgttat ccgctcacaa ttcacacaac atacgagccg 37500

gaagcat 37507

SEQUENCE LISTING

<110> LIBRAGEN

<120> Method for the expression of unknown environmental DNA into adapted host cells.

<130> B0149WO

<140>
<141>

<160> 16

<170> PatentIn Ver. 2.1

<210> 1
<211> 37500
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: DNA sequence of clone FS3-124.

<400> 1 , „ na etattacgcc agctggcgaa agggggatgt gctgcaaggc 60 

gatcggtgcg ggcctcttcg <*»"»°*~ agtcacgacg ttgtaaaacg acggccagtg 120 
gattaagttg ggtaacgcca gggttttccc agtcacga g « tcC cacgtac 180 

aattgtaata cgactcacta *™ g cggcgtcgtg 240 

cacgagctca tctggaagag cccgctccag aca g gg gacgatgaag 300 

agaccgtaca cgcagaagcg tctcgtcgcc ttcgttg^ *f * cggg cgcgatcat g 360 
ctggatctca ccaggatgtt cgccgcaggc ^otcgatgg *J* cgtcC acgtg 420 

ctcgcgattc gctatcccgc acgattcgcg ZtZlJl gcccgacttg 480 

cccgccgact ctccgcaatt cacatcgtcg tacgagctgg J ctggta cctg 540 

aaggtgccgt tcgagaacgg cacgccggtc tg«atca« * cggcaa gaa cgactcg 600 
cggcagcatc cggagcagga catcgggttc atcacgttct eg gg caggcaaccg 660 
gogategget ggcgccaggc «gtoj«tto ctgaagaege ^ ggaagg tggc 720 
cacctgttcg tetggggeca ggagggacac ggee g g y gttcagccgt 780 

ggcgaacggg agatgecget cgacctcaga -ga-caga gectg gg^ * gccgggcag 840 
tgC acgctcg acgacgatcc gggagacggg tcg^tog. gcg^ ^ 
atcaactcct acgtcacctg gcagcctgac 9^ ccgcagtcga cgt cacgccg 960 
gtegtcatea agctggcgaa tcgaccgccg ° * ^ gaa ctccgcg 1020 

cggcggctgc agcagttcaa gctcaagccc ^gtct cgtcacgctc 1080 

ggcaacaccg tcgtgcagcg eggegaageg etaggeggge 1140 

ccgcagatgc aggtctcaac gggtggaaac ^atcctgg 9 * ggctgag 1200 

ggcg tcctcg acgagccggc gcatcggcgc «**g«t£ g g ^ 

ZZIZ E= -iSi — c ~ 1320 
aagtccccct gcgacggccg gggtcacgac gttcgggagc cccgcggcaa tcgcttcgag 1380
cactttgttc tgcacgcctc tcgcgatcgc aatcggcgcg accgcgatcg ccgatcgcca 1440 
gagatactcg cgtacctccg ggaccgcgcc ggtcacttca atcgtgggat cggccgcgag 1500 
tgcccgcacc gcggcggtcg ggtccatgcc cacgagcatc agccgcgcgt ccgggcgggc 1560 
gcgaaccacg gacggccaga cctctcgcgc gaaccagagc gccgcgcgct cgttgggctc 1620 
gtagccgaac acgccgcaga agacgacccg cgcttcggcg gccggtggtc ctttcggcgc 1680 
aaacatctcg acgtcgatgc cgttcggcac gaccgacgcc gtgaggccgc ccgtgacttc 1740 
ttcgagcagc aggcgctcgc gctcgctcac gaccgtggtc gccgccgcgc gccgcatcgc 1800 
gactttctcg aagcggagca agcggccggc ctcgcggcgg aacagccagc ctttcggcca 1860
cgccgccgtc gacgccagcg cgcgccattt ctcggagtcg acgtccacca tgtccaggac 1920 
gaaggggatc ccgcccagga tcggctcgca ggcgtagcgg gccatcccgc tgcagtacgc 1980 
gagcaccgcg tcgggccgcg tcgcggcaat ccgtcgttcg agcacctggt agatctccgg 2040 
cgaatgcagc aggacgtgcg tgaggggtcg atcgcccggc agcgccagac cgctcgcgat 2100 
gagattccgg ccgcggcgga cgcgcacgac gtcggtcgac gccgtcacgc cggcgagatc 2160 
cgctcgatgc gaccactcct cgtcgtcgtg cgcgagcgag acgaggtgca cgtcggcgga 2220 
tcgtgcgagc gtgtgaatga gatgaaacgc gcgaatgcga tcgccccggt tcggggcgta 2280 
gggcagccgg tgtgtgagca gaaggactct caccgtgcgg catccgatcg ctctcggatg 2340 
acgcgggccg gtacgcccgc ggcgatcacg cgggcgggaa tcggcgcggt cacgaccgac 2400 
cccgccgcaa tcacggaatc cggaccgacg tcagccatga ccacggcggc gctgccgatc 2460
caggtgccgg cgccgatccg caccatccgc ggtgtgcctg gctgctcacg gatcggccgc 2520 
gtgaggtcgc tcgtgccgtg cgtgtccggt ccgctgggga tgtggacgcc ggcgccgacc 2580 
aggacatcgc gctccaggtg gacccagccg agatggcatc cgggtcccac gtacacatgc 2640 
tcgtcgagct tcgcgccggt ctgcgaaaag atggtgccga aggcgaccgt gaccgacggg 2700 
tcgcagtgtg caagcacgcg gccgaggaaa gccgcgcgaa ggtattgtcc gatgacgccg 2760
gggatgagcg agagccactg cgtcgaccct tcgagggccc ggtcggcgcc gaggaagggc 2820 
cgccgcacgg agtaggagag caaggcgggc aggacgagca ccagcgcgat gccacgggcc 2880 
agtcccttgc cggcgtcctt gaggctcacg cccactgcag cgcgcgcccc gcgtcggacc 2940 
gggctccgta tcgtgtgacg agcgagtcgt agagcgcttc cactttgcgc atccgccgct 3000 
cgaacgataa ctcttcttcg atccggcgtc gcgccgcaat cgcacgggtg cgtccggccg 3060 
cgggattggc catcgcctgc tcgatcgcgt ccgcgaggcc tgacgcccgg ccgggctgaa 3120 
tgaccagggc gtgcgtgccg ttcgacacga gctcggccgt gcccccggcc gcggtcgcga 3180 
cgatgggtgt ctcgaacgcc atggcttcga gcacggcgtt gggcgtccct tcgtagtccg 3240 
acgactgcac gagcaggtcc agcccgtgat ggacggccga gacgtcggac gtgtggccga 3300 
ggaaacgcca ggttccgggc ccgagctcgc gggcggcctg ttcctcgagc ggacgccgga 3360 
ggctgccgtc gccggcgatg acgagacgga ggcgcggcca cttcgcctga agggcggcac 3420 
aggcttcgat gagcaagtcg aatcgtttct gcggctcgag acggccaacc gcgccgatga 3480 
cgaagtcgct ggcctcgaat cccagtcctt cacgcgccgc ggcctcagat cgtggatcgc 3540 
gtcggaaagc atggtggtcg atgccattcg ggatcgtgac gacccgatcc ggccgtccgc 3600 
ctttggccag gagctcccgc cgaatctgtt cggacacggc aatgacggca gggaagcggg 3660 
cgagaatccg gcgatcgagc gggtagtaga cggtgcgttc gcgaaacgaa tggccggtcc 3720 
atccatgggc cgtggccagc ggaatcacgc gctcggcgcg tcccaggagc agggcgagga 3780 
gatcggtctt gtactcgtgc gcgtgcacga tgtcgatgcg cagatctttc acgagcgcgc 3840 
ggagcttcga ccagattccg gtgtcgaacg aatgcttctc ccggacctcg acgtagtcga 3900 
tgccggccgc tttcgcccgt tcgtccatgc cgaacaccgg atcgcgctgg tcgcggatgt 3960 
agcagacggt gatcgcgtac ttggagcgat cggcctgcgc cgttccgagc agaatcgtct 4020 
tctccggtcc gccgcccgtg ccccgcacgc tgcggagctc gagcacgcgc acgggacgcg 4080 
cgattcccgg cctaggcgcg gcgtctggca gcatggcccc cccggccgac gagcgcgatg 4140
gcccgggcga tatcgtacgt gccggccacg acgagcgact gtgcgatctc gacgagtccc 4200
actgcgccgc gccggcgccg gatcatcgcg gctgcgcgca agccggcgaa accggcaagc 4260
agcaggagcg cggtaccgac aatccaccac ccgccgcgct gccacagcgc gaggcccagc 4320 
ggaatcgaga cgagcgccag cagatcgagg accgggatga tcacgctggg cagatcccgc 4380 
atggcgagcc ggtgacggaa gctcacgcgc aggttgtcgc ggccgcgcca cagctcgccg 4440 
aaaaacacgg cgcgaagcga tcggggatcg ccctggtgca cgctgcggag ccgcgggtcc 4500 
gacaggatgc ggccgccgcc ccggcgcagc cgctggcaga tatccacgtc ttcacaggtc 4560 
tcgaggctgg tatcgaagcc accggcgcgg tcgaaggcct cgcgcttcac ggcaagattg 4620 
ccgctcggga gccagtccac gtcttcggtg ctctgcccac gacggcgaaa cgagtcgtac 4680
gcgcgctgca cccagttgcc gtcaccggga gcttcgtact gcgcgccggc agcgacgacc 4740 
cccggggtct ggagcagatc gacggcgacc accggccagc gcgggtcgat ctcatgatcg 4800 
gcgtcaacga acgccagcac gtcgccctcc gccgaccggg ccgcgcggtt gcgaagcgtc 4860 
gccacgttca cgtcgggcag cgacagcact cgcgcggatc cgcgctctgc cacggcacgc 4920 
gagtcgtcgg tcgacccgtt gtccgccacg acgatctcgt accagccagc cggcgcctcg 4980 
gtgcgatgaa tcgacgccag gcaacggctg aggtgcgccg cgccgttctt gacggggatg 5040 
acgaacgaca cgcgcggcgc cacggcaatt ttcgcggtca tcgatttcgt agcgcggccg 5100 
gctttcgtag cgcggcccgg gttgggggcc gcgagaactc gaacgcgccc agatccggcg 5160 
cccctgctct cggccgccga tcgaaatcga gcggcgcctg cggaatctcg atgccggcgt 5220 
cgatcgcctc cctggcggct gccgacaggt gcagatcgca cgaggccgcg tttgcgaacc 5280 
actctgcggc ggcgttcgtg acgttgttcg tcgcctccgc ctcaccgccg ttacggcgca 5340 
ggatctgccg gctcgtcagg ttgttgcgga tgagcccgtt cgtcacggga aaccggacgc 5400 
cgatcgtcca cggcgtcgcc gtgccgagcg tgaagacgct gttgtgctcg atccggaagt 5460
cggtggacgc gttggcttcg atcgcttcgt cggcccattc gttcaggttg cagatcacgt 5520 
tgttccgcac gatgccgcgc tggtgatcga tctgcttctc gccgtcgcgt gcgtaggccg 5580 
tgatccccgc ggccaggccg atggcgatgc ccctgaacga atcgacgatg acgttccgtt 5640 
cgatgagcgt gtcctgcgag ttcgcccagg cgaggatcgc cgggccggat cgccagccgc 5700 
ccgattgcgg gccgcgaatc cgcaggaaga cgttgtcgcg gatgacccac ccccgggccg 5760 
ccaggatgtc cacgccgtcg gtgtaatcgg acggcgcgct cgtcgtgtac tcgaagcgcg 5820 
agcacgccac gaggccatcg tccgcgaagc gtccgtcagt gccgacgctg cctttgagca 5880 
attgctgtcc cgcatcgatg agctggacgt tgtgcaccgt ggctcgggac gcgccaagct 5940 
cgccgcgcac ctgaatcgcg tggtacccga cgtggccgac cgtcagatcc gcaatcgtca 6000 
cgtcagtggc gccaaccgag agcgccacgc cgatcgcgtc gccggtcatg ccgccgccgc 6060 
gaatgacgac gcggtcccga tcgccggagc ggctccggag cacggtgccg gggacgttga 6120 
tgctggccat ccggtcgagc cggtactcgc cgttctcgag cagaatcgtg gtgcgcggct 6180 
tgacgcgttc gagcgcctgc agcagctcgg acgtccgccg cacggtgacg acctgtttcg 6240 
tcgccggcgc cgcccatccg cagaagtgcg gcaccggcgc ggcggcctgg ccggacggca 6300 
gaagcgccac gagtgcgccg gccacgatga ggcgagtcat gccgccaccg tgtcgacgtc 6360 
gtgttcgcgc aaccagagct cgaggatgag caggatttga aggaggtagc cgcgcgactg 6420 
tctcccgtga cgcgtgtcgt cgatgatggc gcgcaacgtc tcggggcgca ggaacatccg 6480 
cgtgcgcgcc tccggggcga acagctcgcg ctcgacggct tcgagcagtc gcgtctgcag 6540 
ccagcgatcg aagtcgtgat agtgacggta cccaggcgcg ttcaggcgct tcagcagggt 6600 
gttgaccttg tcgaacacct tctccgccag cggaccggcg tcgccgcgcg cgccggtgtt 6660 
cgagttccgc accttgagca gcgccgggtt gccggcctgc gtgatcgcgc gatgaagcgc 6720 
cgtgtcgtcg cgccatgccg gatcgccgct cagcaggacc tgcaggaact cgcgatccat 6780 
gaacggcagc cgcacgtcca cgacggtgcg gaagagatcg agcgagggaa tcgtgaagcg 6840 
ccgatgatgc tcacgaagat agagatagct gctgagctcc gcgggtgaga gctccacggt 6900 
gtcgaggagc gcgtgcagcg agtcggccgc gccccgtccg gagaggcgcg cagcgtctgc 6960 
tgtgaagagc gtgtccagcg gcaggccgct gctgatgtag ttggcgcgcg acgtcatgta 7020 
gtcgatgaac tcgccgcggc ccgtcatcgc gaacacgtgg ccgtcggtgt ggagcggcca 7080
cgccaggctc gtcttggcga gctctccgcc gtggccgcgc agcaggaccf cgataccgga 7140
gtcccgcagg aaatcgatcg ccagcatctc ggtcagtccg tggctcagat acatgccatc 7200 
ggtgagcgag accatgcgcg cctggttcgg cagaaagtcc ttgaggtacc ggtcgtcgag 7260 
ctcgaagaag cggtggtcgg taccggtgag cttcgacagc tgttgcgcga tcacctgatc 7320 
ggcgcagccg gcgacgccca gcgtgtacgt gatgagcccg ccaacaccgt tcaccgcact 7380 
cagcacggcg cggctgtcga ggccgccgga gagcgacagc ccgaagcgat gcctgcccga 7440 
gagcgacttc tcgaccgcgc gcctgaacgt cgtgtgcacc gcgtggacgt acgcgtcctt 7500 
cgtcgtacgc ggtccgcggt agagatcggc cggctcgaaa taccgggaca cgctcgcggc 7560 
gcctgacggc cagtcgtacg cgagcacggc gccggggtcg aggagcgtga cgcccgccgc 7620
cagggtcttg tcgcccagga cgaagcccag cgtgacgtaa tcggcggccg cccggaggct 7680 
gagagccggt ttgcgcgccg aggtcctgag cacggaagcc agatccgatc cgaacgacag 7740 
tccggccgcg tcgacccgcc agtagatcgg atacgagccg aacggatccg ttgcaagcag 7800 
cactcgctct cgcgccggat cgatcacggc gagcgaaaac gcgccgtcga gccgggcggc 7860 
catggcaggc cccagccgtt cgtacaccaa agccaggaga tcggcgcgcg tcccgttcgc 7920 
cgaaccaaac tcgctccgga gcgccgcttc gttgtagaga tcgccgcgaa agatcaccag 7980 
cggccggctg acgtcgcgcc cggccgccac ctcgtcgagg aagccgagcg tgccgagccc 8040 
gaccgcccat cggtccgacg gatctcgata gacgaccgtg cggcgcggcc tgcccgactt 8100 
cgccgctgac gcggcgtcgg cgcggctgaa ggcggtagat gccggctcat cgcggcaaac 8160 
gacgccgagg aggccgttca tcagacgacc ccgcgggcag acacgcgcga cggaacaaga 8220 
ggtgtcgtct gagcggccgg aaccggccgc gcggcgtgcg ccgccaccat ccggttgagc 8280 
gcaatcgaca agccgaggaa gtgccagagg atctcggtgt actgaaagat gacgaacgtt 8340 
ccgcccacgc agaatgcaat gagggcggtc tggacggcca gcgcgtacgg cgccagctcg 8400 
gccagggccg gatcccgctt cgagacccgg cggaccctgc ggcaggcgat gagcgccagg 8460
aggagagtga gcaggaacgc cgcgaggccc acgaacccaa gctcagccac gagaccgaac 8520 
cacgcgctgt gaaccgagcg ccccttgccc caggcgccgt ccgacgtgtc gtagtcgttg 8580 
taggcggccg tgaaggcatt gtgaccgacg ccgtaaatcg gccggtcatc ggccatgcgg 8640 
agcgcgaccc tccagaagtg cagccggctc tgggccgatc gctgatcgcc aaccgccgtg 8700 
tcgctcgcgt cgatgctcga gtcggacgag ctgatcgagc tcatgcgatc ccagaactct 8760 
tccgtcatca cgagcgacat cgcgacgaag accgccgccc cggccacgag gacgagcatc 8820 
ttgcgcttcg agtacaggaa atagacgagg ccaagcacgg cgagcgtgag gaagccgccc 8880 
cgggagtacg tgctgatgcc gcgatagagc accccgatga gcagcacgcc cagcgcccac 8940 
ttggcccatt tcctcgcttc cgtctgcatc agcgtgacga tgagcggcac gagcatgaac 9000 
atgccgacgg cgacgccgtt gttgtcaccg agcgagtaga cgtcgttgaa gttcttgatg 9060 
ccgggaatga acaggagctg accccagccc tgcttggcgg cctcgaaccc gagggagagc 9120 
gagatcacga ggaagacgag ccgcaaccgc ttgacgtccg tcgtgaggac tgcgagcagg 9180 
tacgtcatca cggtggactt catgaagtcg atcatgtagc cccacgcgaa gtcctggtag 9240 
ggcgacgcca gcgtcgacga caggctgtgg agcccgaaga aggcgaggag cgccagccgc 9300 
gcgtcgatcc gcagcgcctg cccggacagg aacgcgatcg cgagcgtgta catgcccgcc 9360 
aggagcgaga tgttcatgtg gatgagaaag tcgctccaca cccagagctc cggccggaag 9420 
tacgcgatga acaggtagaa cagaatcgcg tagaacggcc cgcgcagcgc gtggaaggcg 9480 
ccgaaggcga ggagggcgag gacgaaagcg gctcgaagca ttgccgtgtg ttacctgtta 9540 
tcggtcccga tggtccagag cggctgcgtc cgggtctcct ctacattgaa caccttgtag 9600 
agggccggaa tcgaggtgaa catgagcacc acgaaaagaa tcgccgacac caccatgtag 9660 
gcgaagaacc cccgctgctt gtagagcttc tccggattct gcacggggct gttcggcagc 9720 
atcccgatgt gcaggtagta ggcgaacatg cccgccgcca ccggcgcgaa caggatcagc 9780 
tcgaggtgat agcggacgat gaagatgccg ccgaagaggg cgcagcaggt ggcgtagaac 9840
accatgctca cgagcagccg ctcctcgttg tagaaggcaa acgacttccg gtacgaggcc 9900 
gcgaccgacg cgttgccgat gtgccggaac tccgcgtacc gcttcgtcgc catgaagaac 9960 




gcgcccacca tccagtacga gatcacgagc gacatcggcg gcacgcgatc ctgaatcagc 10020 
gggaaccagc cgaggaggag ccggacggcg ttgttcgccg actcgctcag cacgtcgagg 10080 
tacggccact ccttcgtgcg aatgggcggc acgttgtacg tgacgccgag cacccagagc 10140 
gccagggccg acagcgcgaa gtatctgttc acgagcagcg cgagcacgaa tccggccacg 10200 
cccacgagga tccactcggt gtagcccgcc gccggcttga tcttcccgga gggcaccggc 10260 
cggtgccgtt tctccggatg cagcagatct ctcgggccgt ccaggagctc gttcagcacg 10320 
tagttgctcg aggcgatgag gcaggtcgcg gcgagcgcga gcgcgagcgg cgggacggcg 10380 
gtccagccga agagctgcgg ttcgtagaag aacgccagca gcacgccgag cagcatgaac 10440 
gcgttcttga accagtgatc gatccgcgcg atctggacgt acggccaaat ccgcgaaccg 10500 
gagctagcgg ctgacgacat cgttgagctc ctcgagccgc cggatcgcga accatccgcg 10560 
gccggctgtg acgccgtggc ggccggcgat gagggcgcag gagagtcctg cggccatcgc 10620 
tccttctccg tccacgtccg gtcgatcccc gacatacagg acctcccgag gtccgagtga 10680 
ccacatctca cacgccacat ggaagccgcg cgggtgcggc ttcagggcgt tcacgttgct 10740 
ggcggtcgtg cagagcgcca gcgaaaaatg ctgcgacgtg ccgagcgcgg cgagcttgct 10800 
ctcgggcgcg tagtccgaca gcacgccgag cctgagcccc gccgcgcgaa gcgagtcgag 10860 
cgcgccgagc aaccccgggc gccggcacca ccgcagatac ttgagcggcc ggcgcaccat 10920 
ccattcgttc accgtcgccg ccacgacctc gttgtccagc ccgagcgcct gcgcggtgcg 10980 
cgcgatctgg cgcgccgcga gcggctcgtc ggccgcgccc aggcgccgga gatcctcatg 11040 
cgcccggcgg tactcccgca cgatgcgcgc cgtgtccact ccgcgccgcc accggaacgt 11100 
cacgagcggc gtggcggcga gctccgctgc catggcggcg cgcaacggac cctgccggta 11160 
cagcgtgccg tccacgtcga acagcacggc cttgaccccg cgcaacggct tcactctgcg 11220 
aggcccagca tccgcagaat cgccgacacc tttgacgcgt cgcgaaggaa cacgtcacga 11280 
acccacacgg cagccagccc cgccaacagc accgccgctg ccgcgcccgc tgccgtccgc 11340 
cagctcccgc gccaggcgcg cgtgtcgatc gcggcggcgc cgtagcagag cagaatcggg 11400 
acgagcggca ggtgataccg gggatgtccg aagacgatcg cgtgaatgcc gctcatgaac 11460 
acgatcaggc tcaggagcag caggtgcgcg cgccagttga ccggccgcgc gaacgagatg 11520 
ccgatgacgc cgagcaccat gaccgccgcg tacgatccca tgatggacac cgccccggcg 11580 
aagaacagcc accgcggcgg acggtagaac ccatttccga cgccggcgat gaagtcgcgc 11640 
tccagccccc agaagtctgc gaacttcagc accgaacggc gaagcgtcgt tcccggatgg 11700 
gccttcatgt agtcgagcgc ctggcgctgc gcccacttct ctttcgtgcc ctccgtccag 11760 
cggctgccgt ccggcgcgcg aggcggcatc gtgcgcgacc attccttgtc gcccgtgagc 11820 
gagatcgcgt cccacatgcg atcctcgggc gtgtacgcgt agttgcccat catcaggttc 11880 
agcccgccga gcgtgtccac gaccgtgaac gagcgttgca ggagcgtgtt gcgcacgctc 11940 
cacgggccga cgatcgcggc gtatccggcg aagagaagca cggccacgcg cagacgcgcg 12000 
cgaagcggga tgcccagacc gacgagcgcc gcggccatga ggatcgcgac gtacggccac 12060 
atgatcgagc gaacgagcgc ggacgcgccg agcgcgacgc ccgtcgcgag cgcgatccac 12120 
gccgagtgcc gccctgatgg gcgatcgatg agcgtcagac atcccaagag caacaaaaga 12180 
aggagtgtga tgaagagcgt ctcggacagc accagcacgc cggagaacag cagcgaggga 12240 
tagaacgcga acgccgccgc ggcaatgagg ccggcgcgtg cgctgaacaa cctgcgcccg 12300 
atgagataca cgagccagac accgagcagg ctcaacggaa cctgcgccag ccggaccgcc 12360 
gtcagcgatt ccgagcccgt gaccgcccag acgcccgcga cgaacgcggg aaagagcggc 12420 
gcccggatgg acgtcggttc ccccggcccc cacgcgaacc cgttcccggc gacgacgttt 12480 
ctggcaagcg tcgcgtagtg ctgctcgtcg cgaatctcga gcgcgacgtc cttgaacccc 12540 
caaatcaggc cgagccggag gaccagggcc agcgtgagga tgacgagaat tgcccggcgc 12600 
tcccaatggg tgcgctcggc ctcgtcgcgc ccgggggcga cgattggcca gggggccgac 12660 
gatgggagtg ggggagcctg catcattcgc taaaacaagc agttacggtc aaacctgtgc 12720 
caccggggaa tcagcaaaga ctctgccaac gggtccgccg gcggcgacgc ccggcgcgcg 12780 
ttttccggcc ggaagtcggc ccccaaggcg acccggctgg gcacagccca gtagaaatct 12840
- .„n*tttc qaggggcctt cgtgacgccc 12900
eattctggga tgacttttgt ££££ cggccgtgag 12960 

tggggatggc ataagcttgg cacctgcgat ggcg g tggc gcggC ggttc 13020 

Srctccaga aggcctatcg gga = ^ ^ c 

cagtcaccgc tgggggcgct cctccattcg eg J gtcC agcgcg getgacegtg 13140 
cgcgcgcagg geatoeggeg cgeggeggag ^cgccca gatgettcac 13200 

Tacatcgcgc cgctcctc.a ccgcgt.ac, cg c 1326 

gaagcgcgcg ageggctgeg cgaccggacg ct g 9 tctacacctt caggctegtg 13320 
g acgcgtttc gtctgccgct cagcgggccg agafctgcgac gat tctc gg a 13380 

cgccatttcg agcgcgccga teggctgega tgtct gcgcc gcttcgcgcc 13440 

cccggcggct ggctegtett egatgeggtg * cggg ctacgc 13500 

C a C ac g g g cgaagc ccggcgagta c.agcactac tcgccggt tc 135 0 

gacgagctgg ccgaggccgg ^^cacgatg caacggtg ct cgcgcgcgcg 13620 

cccacgctca tgaagtgcca gatgtatctc ^ tcgt cgt at g ccgc 13680 

C gcg at 9 ggagg tgatcgatcg ^ agaagcgatt tcg aa gg aa g 1374 

cgcgggtgac ctactggacg ggtacctggg tcgttc tC gccggccc 13800 

tgaacgccct ccgcaccggg tegegggega g g gcctggatcg 13860 

iregcagocg tctcgtgccg aaagatc.cg , ^ 

. cccttcgggc cgcggccgcc gtgetcgage eg g gg t ctcaC cgccg 13980 

^gttete et.gcatete gtcgaagt 14040 

tcaaggaaga tggcgctgcc gatctgcccg OT 9 gcgtg cacctcattc 14100 

gccgcgctcg egaeaegtgg atcgcggccg J**^ c f cgtcggag ag attcac g c 14160 
Itcccggcgt ggaeetegae tggt*^ eggtge^g ^ 0 

tgctcttcgc eageaegeeg * 9* ggtgccgtgg egeeggtggg 14280 

tegtcgaget ggcgcggctc egtceggaga teg j> tccgccaC cg aacttcatcg 14340 
gcg acgtcga cgcctccagg cgcgcgctcg gegggegcae gccaccgtcg 14400 

tetegtaega ggacgcaccc gaeatgegeg ^ctegag ggattggcca 14460 

tg tgcttcgc gggcggcacc ggcaaggect ggagat ccte gatcgategg 14520 

t g gg g gacggcc gtgteteteg acgegegaga ^c-t^ 

gcgccggcat catcgccgcg egegatgceg eg * gag acg ttcgacc 14640 

^ee.getg ggeeggetge geegagegcg cg cgccgg ^ 4700 
Lggeegett ccgcgccgc. t^ogo ^JJoJo ggacggctgc tegggtgggc 14760 
gcgagcacgc gaggcgcgcc tettgaegaa W9 tcgcagatg a agcgcgacgc 14820 

teagaaggeg aacgatttcg ccgactacgt ^tgaaee ggatcctcat 14880 

gcgggcgaat cgecgeagge teeateggge ggcgt ecteg gggccetgeg 14940 

g cgt g gcggctg gccagcatcg gcgacgt^ q „o^ tggccgtcat 1500 p 

cgcgcagtac ccgcaggcga ^tccag gatctgtege geetgteega 15060 

ccggcaccac cccgcgatca cgtgcc geeg gegagcttte teagaaegtc 15120 

atacaactgg atcatcaacc tccagaacct eg « cqc agccgtt ccaggatcat 15180 

fgggctctcg tacccgcaga ttetegagea cgc^ fcctactgcat 15240 

caccggccgt cacctcgaga aeggctegga gg cccgc taccc 15300 

ctgtgagatc gaggagcatt tcctcacggc **^J gcacga tc g a agttcgcgtt 15360 
gg gaccgcg attcacgtgg aegeaeagge l actcgqtcg gg t g c g gcgc 15420 

cccggagaac cgccaggtgc t«c,.tctt <£«™ cgcc tcgtgc gtcacttcga 5480 
cgatgagggc tttegeaegt £oo££ eta ^g cc 15540 

agatcggttt gegatcgeca teategggea « ga tctcgtgg acaagaegtc 15600 

a cagtaccgc gagatgeteg ^cctco ^ c f/ ^catcagtt gcgactcgag 15660 
gc tegaegag ctcgtggccc gggctgtacg tcaacgacgg i 5720 

ccccgtccat ctggcgatgg cccgcgg y
^-tc qtggccatca acagcacgcc 15780
gacgttcagg atgcacccga cgctcggcga g 95 gcgacccggc 15 840 

gccgtgcttc acctactcgt ggcgatggaa ^tcttctg * gcattccgct 1590 0 

9 gac cagatcg cattactgtc acaacgaggc ^fccgatg caggacctca 15960 

ggccgcaatc gaccgcgcgg cggcgcggtt ^t^agggg * ctcgtcgatg 16020 

ttcgcgcgca gtttgaagag tcgatcgaga e*"*££ /^gacac aagatcctcg 16080 
cggtggccga gtcggcgcgc ctcctcatcg W*« ^ccgag ctcgtgggcc 16140 
t'gtgcggcaa cggcggctcg gccgcc.a t cat tt ^ ^ acgtcgatcc 16200 

gcttcgagaa ggagcgccgc ^tgcctg ccgtgg g tgagtgcctc 16260 

tcaccgcgtg gccgaacgac tacgcgtacg agacc g ^ caaC atcctc 16320 

ggagcgccgg gcgacgtgct catcggcatc ^ * cgcgctcac cggacgcggc 16380 
gccgccatgg acgaggcgga aaagaaacgc J^ccgtc agatcagacg 16440 

ggcggcaaga tgaagagtct cgccggcgtc ^ gggcgaag tt gatcgagaaa 16500 

tcgaggatcc aggaagtaca cattaccgtc £glogto gaccaggagt 16560 

gcgctcagct gagctggcgg tagacgtcga gcggcgg agg tgctgatcct 16620 

gctcgcgggc gatgaggtcg gcggcggcgt gtccgcctgg acgagaaagc 16680 

cgaacaacag ttgcagtcga ttcgcgtagc ^oggogtc g 9 atcacgggca 16740 

cggtcttccc gtcttccacg acttcacgca ctcgtgcgtt gacggcagca 16800 

cgcgggccga ctgggcttcg agcaccgaga ^ggca^ccc c g S gca 16860 

ggaaggcgtc ggcggcctgg agcagcaccg ' g c Ctg ctcg tgttccttgc 16920 

cgcgatcctc gagcgcgagc tcggcagaga caggccgcgc gacat cagca 16980 

cctcggtccg gtcgctcccg accagccagc actcgaccgg gcgagcatga 17040

gcctggtcgc ccggattgcc gtagcctgcc *ttjtgtgg * * cggtcgccgg l7100 
ggacgacgaa gcggtcggtc tcgaggccgg gaccggcacc ttgtcccggc 17160 

ccacgaactt ctcccgatcg accgcgttcg « g£jacacgtg aC gatgtgct 17220 

gt ccctgctg cgccgcgtcg .toatajo* og.t££ ^ 0 
gcggcgggct cttcaggacc caggcgatct ^cggg acgacaggat 17340 

cggccacgcg cgcgcgcgcg gcctgcagcg ^ 9 cggaagcatg cgcg cacacg 17400 

tgtgcacgtg gaccaggggc cgcccgacga gtcatacatc cgccgccgga 17460 

cgg cgagctg ttgcaggccg ccggcccgca g ^ gccgtgcacg 175 20 

cgccctccgc gtcgagcgcg gcgcccgcag *»^* cagcaccgcc gcgccgocga 17580 
ggacgccgcg gcgtttggcg gcggccgaca attcatggtg gcgcgcgatg 17640 

ccagatgtga gacgaggatc tcgtgcaggc WOT^ gcccgcgcgg tagagcgagc 17700 
cgg ccgggcc gccagcc^ = gt J gcgtg agatcggt ,7760 

gcagcaggct gcgaagtccc gacgggagg j» cgcgtgaaac tcgccgcggt 17820 

gccgcatcgg caggcccgcc tgaccgaacc a t gatc gctg gcggcgcgat 17880 

cgtactcgag gtggccgagc gcgagcagga J cctgacc atgC cttcga 17940 

gc gccgactg atcgctcccc cgcgccagga cctgtaatgc gtcaggggtt 18000 

gcatcacgat ggacgacttc gacatgttgg ccagaggtcc cagtcctcga 18060 

Lgggatgaa accgaccggg ooott^ ££££ ^ ^ tg accatgaccg 18120 
cgccgtggat gccggtcgcg "catoboct gccg g acagcgcccc tcgagcggcg 18180 
aC gaagcggt cacggcgttt cgcaggacga ggtcgggg g CCC gtatgaa 18240 

gaagatcgag cgtgcgcagg acctcgccgg ^cgtogat * £ ttggC cggat 18300 

tgaagctcca gtcggcgtgc tgtcggaaga ^cgacotg ^ gcga t g gcgc 18360 

cccacgagtc gtcggcgtcg agcagcgcca cacc'tgatg gcgccgocga 18420 

s= =~ =i: = = - 
zsz ss= — < - 18600 
gctctcgagc ggcttcagcg gccatggccc tctgcactgg cgtcgcgagc gaacagcatg 18660
acgtttctcg aaagccacgg agacaacggc gcgctcttca tccgctcccg cagcgtctgc 18720 
gatgccgggc cgaggtagct gaacccgcgt gccgcgaaga tcttctgcca gtactcgagc 18780 
ggctgttcgt tgatgtggcc cattccgccc tggcccggct gcgcggcgct gaagagcacc 18840 
agcggcgccg tgttcgtgag cacggcgacg agcttctcgc caagcggcgg cggaaggtgt 18900 
tcggcgacct cgaggctcat ggagaggtcc gttgcgtccg cggccgtgga caccgccgtg 18960 
agatcgaatg agttcagcgc gacgccgagc ttttccgcgg cgtaccgtcg cgcgtactcg 19020 
gatcgctcgt agccggccgc cttcacgcct cgtgcgagga acgcgtgcac gaagtgaccg 19080 
gtgccgcagc cgaagtcgtt catcgatcgg atgccgggtc gcaacgccag gaccccatca 19140 
acgagcacgg gcgccgcctc ggcggcgacg tgcccgaaca gatccatgac ggccgcgttg 19200 
tacgtccatc gctcgaagcc cagccgttcg ccgatccggg acatcttcag tccggcctgg 19260 
caacgccccg accagtagag ccgcttgagc tgggtcacga tcatgggttc gtcccgggcg 19320 
cgacgcgaac gcgcggcggc gcatgcagcg acgccgagaa ctccacatcg gccacgccgg 19380 
cccacgatcg atgttccttg accgccagcg tcgcaagctg ccgcgacggc acgaggaacg 19440 
tctgggtcgg gttgtgaaac acgtgcaccg acaggtagta gtggccgcga acgagatgcg 19500 
cgcggaaatg gaggtcgagc gagaactcgc ccatgaacgc ggacgcgtcc ccgagatccc 19560 
gccggtggat gttgccgtcg tagacgacga gctgatcggt ggagcgttgc accatgaagc 19620 
cgaacgtgac gtcgtccagc ggctcggtac ttctcgcggt gacctggagg ttcagctgct 19680 
tccccggcgt cacttcgctg tgcgtgcccg acgactcgtc gccgacgaag agcgagccgg 19740 
acacgatctc gacggggctc ccgacactcg gctgttccga tttcctggac gcgcggacgt 19800 
acgcgtcgag gacttcgatg gccgtgccgg cggccttcac cgagccggcc agatacaccg 19860 
catgctcgca gagggtcgcg accgcctgca ggttgtgcga gacgaggacg atcgcgatgc 19920 
cctggcgctt gaactcctgc atgcgatcga agcacttctg ctggaacgtc atgtcgccga 19980 
cgctcaacac ctcgtcgatg atcagcacgt cgggccgcag gtgcgcggcg atcgagaagc 20040 
cgagccgcgc gttcatgccc gacgagtagc gcttgacggg cgtgtccacg aagtcggtga 20100 
cgccggcaaa ctcgacgatc tcgtcgaacc gctcggcgat ctcgacccgt ttcatcccca 20160 
tgatcgcacc ctggaggaac acgttctcgc gcccggtgag gtcgggatgg aaccccgcgg 20220 
cgacttcgat gagcgcgccg gcgcgccccc ggacgatcga gcggccggtc gtcggcttca 20280 
ggatcttcgt cagcaccttg agggtggtcg acttgcccgc cccgtttgcg ccgatgatgc 20340 
cgagcgcctc gccgcggtgg acttcgaacg agacgtttct gaccgcccag aactcctgtt 20400 
cgtccagcgc gtcggcggga cgcggttgaa acagcccgcg cacgagcgcc ggcacgaggt 20460
cccgcaggct gtcatgacgc tcgccccgcc ggaacttctt ggagacgttg tcgaagacga 20520 
cgggggccgt cacgtcagac gttctccgcg aattcgaact cggagcgatg gaagaacagc 20580 
cagccgaaga cgagcgtcac gaccgagacg atcgccgtga cgcccagcat cacggacggc 20640
ggctggttca ggagcagcac ggcgcggaag ccgtcgatga tcggcgtcat cgggttgagc 20700 
tgcagggccc actggagctt gccgcccacc atttcgaccg ggtagaccgt cgacgacgcg 20760
aacatccaga cggcgatgac gacctcgaag aggtatttca cgtcgcggta gaagaggttc 20820 
gccatcgaca ggatcagggc gatggccgtg gtgaatatca cgagcacggc gagcacggcc 20880 
ggcagccaga ggatgttcgg tccgaccggc acgtggtagt agatcatgag cgcgacgagc 20940 
accacggagc cgatcgcgaa gtccacggcg ctcacgatca cggcggagaa cgggaagatc 21000 
tccctgggaa agtacacctt cgagacgaga ttcgtgttcc cggtcaggga ggtcacggag 21060 
aagcggagcg ccgaggagaa gaagttccag acccacaggc cgcagaacgc gaacaccgga 21120 
tacgcgaccg gcgtgtcgat ggccgcgacg cgcatgaaga tcacggagaa gacagccgtg 21180 
ttgatgagcg gcatgaacac ggcccacccg aagcccatca ccgactgctt gtagcgcagc 21240 
agcaggtcgc gccgcgtcat ctgcatcagc agctcgcgat actcgaactg ctcggcgagc 21300 
atctcacggc aatcggccag gaaggcgcgc atcgtcagcg gtgtcagtcg cggcgtcgtg 21360 
ccgggcggat catgagggca ggtagcgctc gagctgcccg aagagcaact gcacgaactc 21420 
ggtcgcgcgc cgctcggcgg cgacgtcgcc gttttcgctc gacggcacgg ggctcgtcac 21480 
gcggacgagc gcggcgtccg atcggttcag ccgcatcgcg tcgtacacca tgtagatctt 21540
gctccagtac tcgctggcga tcacgcggcc gtggctctgg taccagtaca gcacgatttg 21600 
ccgctcgatc cccttctgga tgacgtatcg gttgatctcg atgggctcgg tgcggcccgg 21660 
caccttgacc agcgcgcggc cggtggatac gggctcccag ccggcgcccg gcaggcagtt 21720 
catcggcgag tggatcgtgt cgccttcgcg ctggctctgg tagtagccga tgtagagcga 21780 
cacgtagggc tcgcgcggcg ccaggtagat ccggttgatg tactcgtcca cgccgagcac 21840 
cgccatgacg tccttggtga agggcgcgct gtcgcgtccg gtccaccgct ccatctggaa 21900 
cgggacgttc gcgagcggct ggcgaagcgg gacctgctcg gcgcgcatcg agcgcgcggc 21960 
aaacgccgcc gtgccgagga aacacacgct gagaatgatc agtcgtctgg tcatgaatct 22020 
ggactcacac cgtggccggc gcggtcaccg ctcggggcgg aggcgtgaag cgccgcagga 22080 
cccagaccac gacgaacaac atgacgaacg cggcaagaaa gacgagccag cccgagaacg 22140 
tgtggaagaa cccctgcgcc gcggcctcgc cgtagtagtg cgccgccacg cccgtgccgg 22200 
ccacacgcgc gccgttcgcg ataatcgcga tcgggatcgt cgagagcgca atcgccaggc 22260 
gcatgccgac gcgcggatcc gtgaagtagc cgtacacgat gcccagcgtg agcagcgaga 22320 
tgagcgaccg gatcccgctg cacgcctcgg ccacttcgag cgtcgtgtgc gcgagcacga 22380 
tcacgttgcc ttcgcgcagc accgggatgc tgagcgcgga cagcacgccc tcgcccacgc 22440 
gagacgccag gagctgcagc ggaaatgcga tctgattgaa gatgatcgac gggatcggga 22500 
tcatcagcac caggaacgcg agcggaaacg cgagcaccca gagcttgcgc cagccgagca 22560 
ggaacacgat cgcgccggta agcgaggcta gtatcgacac gcgcgtcaga aagagctcgg 22620 
cgccgagcaa ccccgcgacg agcagcgcga gcccggcgag cacgatgacg agaccgagcg 22680 
cgctcggcgt gtccggcagc gccgagagac gcgcccggcg ctcccaggca aagtaggccg 22740 
cgagcggtac gatcagaatc ccgtgggagt agttgtcgtc gttgatccag tcctgcgcca 22800 
accgcacgat cacctgccca taaaggagca ggaaactcac ggcggcgata ccgagcggaa 22860 
ccagtctggc ctgaagcaca ctcatagcga tcgggacgaa tatagcaaag ggctctcccg 22920 
ggcagaactt gggcgccgaa acacgtaacc atcacgctca caggaagtta gcggctaaat 22980 
gacgggattg gccggctgga ggggcctcgt gaatctaccc aatggatgac aaatatccct 23040 
atcaactcct ccttttgact gcgtagcacg gaagatcatc ggcctcttca gatgcaggac 23100 
aaatgtcatc tgtcaggaag agtaatgtct ctggccacag cgccggtcaa gtgaaatccc 23160 
ccgacagaat cgcgtaccac cgcgactggc atccggcgag aaaagtcgcc tcgtcggccc 23220 
gaaaactgca ttacaggaac gccaggtcgg cagccggttc gttttccgtc agtatttttg 23280 
ggcattgcag gcgctttcgt ctgccggggg gcggaaccgt gcgaccgggc cgatcacgcg 23340 
tggagcaggg cagggacagg catggatggt tccgtgagac agccgcgttc gggagacggc 23400 
cgggctcggc gggagctcgt ccacctggcc gtggaccgcg cggcgcgcca atttcctgac 23460 
gcgccggccg taagttttga cggcgtctcg ctctcgtatc gcggcctgac tcgccgcgcc 23520 
aacgcgctcg cgcaccgcct gcagcgtcat ggcgtcgatc ctgacgtggt ggtcggcatc 23580 
tccctcgacc ggtccatcga gctcatcgtc gggatcctcg ccgtgctcaa ggccggcggc 23640 
gcctacgcgc cgctcgatcc tgcgtatccg aaggaccgcc tcgagtacat ggtcagggac 23700 
gccggggcgc ggatcgtgct cacctcgcga cgcgtgagcg gtctgatccg cgtcgatggg 23760 
gtcgagctca tctgcctcga tgcggacgag cccggggcgt tggcggaaga gggccatgcg 23820 
ccggacgccg gcgtcgatct cgaccacctc gcgtacgtca tctacacctc gggatcgacc 23880 
gggcgcccga agggcgtcgc gatgccgcat cgtccgttgg cgaatctcat cggctggcag 23940 
gtcgacgcat cggccgtcgg cgccggcgcg cgcacgctcc agttcacttc gccgagcttc 24000 
gacgtctcgt tccaggaaat ctttgcgacg ctcgcgaccg gcggcgagct cgtgctcgtg 24060 
tccgaggaca cgcgccggga tccgcgcgcg ctgctgcggg tgctgcgcga tcgatccgtc 24120 
gcccggctgt atctcccgtt cgtcgcgttg cagcagcttg ccgtcaccgg caccgacgat 24180 
ccggatctgc cggcgttgcg tgaggtcatc acggccggcg agcagctccg ctcgacgccc 24240 
gcgctcgtcc agtggttcgc gcggcatccg cagtgcacgc tccacaacca ctacggcccg 24300 
gccgaagcgc atgtcgtcac tgcgcacacg ctgactagcc cccccgcgga atggccggcc 24360 
nrtncccqqc gtcgagattc tgatcgtcga cgagcatcgc 24420

ctgccgccga ttggcgtgcc ^tgc^c g g g gcggcgtgtg cctcgctcgc 24480 

cagcccgtcg cgcacggcaa cgt^gag tctcgcat cc gcttcgcccc 24540 

ggctacttga atcagccgca ^gcacggcc ^acgg cgC gatcgag 24600 

gatgctcgcg tgtatcgcac cggo^tcto Lcgtgtgga gccgggcgaa 24660 

ttcgtcgggc ggacggacga -aggt^ag ccgccgtcgt cgcgcacgac 24720 

atcgagaccg tcctccttca ^-cggt gt oggg * ga tcaggcggcc 24780

gacgcgtcgg gcagcaagag ^^7o ggatggcaga ccgtgtggga ggacacctac 24840 
accggagccg aggcttcggc acaggtctcc gga gg g ga gcagctac 24900 

cgcgcggcgg cggactcgac " ccgatgagac cgtcgcgcgc 24960 

acggggatgc cctacaccgc ggcggagatg cgggagtggg g ^ ^ 

atccggtcgc atgcgccacg gag gg ctcg cgccgC gctc 25080 

ttccgtctcg cgccccactg cgctgtctat agcg g gg t cct ctcgcgc 25140 

gaocacgtcc gggcccacgc gacggcctgc WWcagoo gatgaac tc g 25200 

"gaggcggcgg actttaccgg cgttccggcg SSeog, cgcogcccgc 25260 

gtcgcgcagc acttgcctgg cctgcg cctgctcgag 25320 

a ccgtgcgtc cgggcggatt «tcttogtc atcfctccgat tgccct cgtg 25380 

at gtttcacg cgtcggtcga gctcacgaaa gcgccgg gg tccgatgttc 25440 

cg g cgagcgcg tgcagcgtca gatggcgctc fl=0 c tcgtgatcga 0^ ^ 
ttcgacgccg tcatggaccg -tccccggc tcgtgctcac ggtgggcgga 25560 

ggtgccttcg acacggagat gaaccggttc a ^ ta ^^ " ggtcgC cgag 25620 

Sgccggccg ggcgccgtcc gtccgtcgcg cto^tggo gta ^g gg^^ ^ 
cgtcagctcg ggcaactgct cagcgccgat WW* J tgC cgactgg 25740 

a ccaacgcgc ggctgcgtca gccggccgcg ^cttcgat J c g C gatcgac 25800 

ac cggcacgg ccggcgagac gcttgagagg ^cog^t cacctggtcg 25860 

ccggaatcgt ggoggcgtct cgcgaacgcc gc I ggggc ac cgcgtcgctc 25920 

gccgagatcg gcgccatcgg cttcgatgcg ^ * gcggg L at tcagcagtac 25980 
gccgctcccg tgactcgtca cgcggttccc «J™ tcggcgat tc 26040 

gcgacgaatc cgctccgcga tgcggtggcg ^gtcgtact cgatgcgatg 26100 

ctgcaggagc gcgtgccgga tcacctggtg cggcg Lcga gagccgccgt 26160 

ccgctcacgc caagcgggaa gatcgatcgg W«« gg J gcgcag 26220

cccgatctgg acgtggcgtt tgccaagccc 1°°™™* acgacaac tt cttcgatctg 26280 
gtctggcaga cgacgctcca gatctcgtcg ttcggccgct cgtgccgggc 26340 

ggcgggcact cgctcctgct cgcgcaggcg c ^ tc ct cgccgggttt 26400 

aagcagtggt cgatgatcga ^tgttccag ^ gcgcgcag ga tcgcggacgc 26460 
ctcagtgagc ggccggacgc cggcgccggg ^ctt^ccg aggggtC cgg . 26520 

cg ccagcgcg agatgctgac ^gtcagggg Ue'cgggc gcgcggtcga 26580 

taatgcagaa cggcatcgcc ctcgtcggga t^ccggccg ttcacC aacg 26640 

tcgaggagta ctggcgcaac ctcgtcgccg tacgtC cgcg 26700 

aggagctggc gacggccggg ^cagccaga W«tat«ca g^ atcagccagc 26760

cgcgcgggct cgtcccggat ccggacctct ^-gccg ^ tgg ctcgcgc 26820 

gcgaagccga gctgatggat ccgcagcacc /Jggcgtg tggggcggca 26880 

tcgaacacgc ggcgatcgcg ccgcggtcgt tccccggtc ^ caC gcgggat 26940 

tgagcacggg gatgacgaac agcacgtacc tgctgtcgaa tot S fcacctcacga 27000 
tgctccagcc ggaggacgtg ctgccggcgc *gctcggcaa cgag g gct 27060 

gccgcgtctc gtacaagctg cacctgcgcg ££££ "ctcLgtgg cagtgtgatg 27120 
cgacgtcgct cgtcgcggtg f^f^ ggaag gctac gtctacgtcg 27180 

3£S S^'SSS -cg C g tt Ucgaac gcecaaggca 27240 
cggtcttc g gtgatccgcg gatcggcgct g - at gcgcaga 27420

gcgccgccga cgtggcgccg acta * ggcggcgcC a 27540 

cgctcggaga tccgatcgag ttt cgggcac ctcgactcgg 27600 

cggagaacgg gttctgogga ctcggatcgg tcaag 9 ccatcaccag caga ttccgg 27660 
cggccggcgt cgcggggctg atcaaaacgg <*ggj cgccaacagt ccgttcttcg 27720 
cgLgctgca ctac g tc gag coca ^ ^^.tcc gc gacgcgcg gccgtgagct 27780 
tgaacgccgc gctgcgacca ^ f c ^ gctcga ggaagcgccg cccgccgcgg 27840 
cattcgggat cggcgggacc aacgcgcacg aag a cgccggcgg 27900 

afcgaccggg cgccgagccc gccagcgacc £ategt^ gacgtcagca 
cLtcgacgc ggcggctgcg gccga cagga gtttccgcat cgacgagcgc 28020 

:rrr c = s= S 
•isr. = =5 is: = ss« s 
— sss — - ss= ss? S 

c 9 c g gcg c t ctt tgtcgtgcag tacgcgctcg cccg 9 gcgggcgt 28440

ccgacgggct catcggtcac * ctgagagg CC ggctgttc gagacgcttc 28500 

tgtcgctcga gaacgcgctc gcgctcg gt g - ^ ggtgcggC cc gcgctcggcg 28560 
cggccgggga gatgctcggc 9 ^ L^ftcgtg cgtcgtttcc ggcgcgcccg 28620 
a tggactcgc gatcgccgcg * tgtcga atgc cgccgcctcc 28680 

ac gcgctcgc ggcctttgcg t c g atccgat cctggcggag ttcgagcgtt 28740 

acatcgcggt cgcggcgcac teg^W ^ cgtg tcgaac ctcagcggct 28800 

tcctccgcgg catcgcgctc ^agtactg gac.oggcat ctgagacaaa 28860 

cgtggctgac cgcggacgaa gcgaggtcgc eg ^ atcgg gtgctcctcg 28920 

ccgtcaggtt ettcgaeggg etcj^ t£a£^ 9 ^ _ 

aagtegggee eggacagacg ^aggaacc ggcgccggac ^cttgttcc 29040 

ogeaegtegt teaagegacg caggggtgee cgtgaactgg getgettaca 29100 

tgeagcagge ggtgggcagg ctct ggaegg c ^ tccgttcgag aagaagcggt 29160 
acagaggega gggcaggcgt ^tgoegc t g tcaagccctg gcgccaaggg 29220 
acctggtcga atcgccccag «cgccggtoc cag 9 ^ tccggcccga tcgC gtgcog 29280 
ccgatgctac tcagcagatg acgacaact tgc 9^ c 29340

ategtatccg egageggctg accgaaatcc * ctcg ttgttc ctgaegcagg 29400 

agatcgatcc cgcgctgtcg ^ gttccgcca g ctcttcgaag 29460 

egagtctgeg gttcaaggcg *** tc **™ aca tcgacgc caagctgcct gcggacgcgt 29520 
aggcaccgac gatcgacgeg ^ cgcaagcgtg ct gtccgcgg 29580

ttcctgcgcc ggeggaagee 3 ctc ^ cggccgcgC c cggcacgctg cagcacgtga 29640
gccctcatgc ggccctggct ***** Tactccgcat gct gg gcgga aatcccgccc 29700
ttcaacagca attgcagttg «t«o^ a ccgtcgcggc gg co g ac g c g ggactgaagt 29760 
tgctgcccgc gatgocgecg C ^ atgcggc cttcaagccg ategaeggea 29820 

cccgcgctac gacggcgccc tggccgcgct cgccgcctgg atcgatcgat 29880 

aagccagtgc ggagctgtcg ^tggcega gtaccggccg gtgcttgccg 29940

acacacggcg cacggccgga tegaagegae gc 99 ctac ccg atcatca 30000 

atccacgctc ggtcgccgga ««acoggc tat* 99 caacgagtac gtggatc tcg 30060 
cC g acC ggtc ggaaggatcg gatcgtggtc gac.ccgtgg 30120 

tg ggcgggtt cggctcgctt ctgttcgg 
aggaacaact ccaccggggg ttcgagctcg gaccgttgcc gccgctcgcc ggcgaggtcg 30180
cggcgctcgt caaggaattc acggggatgg agcgcgttgg attctccaac accggctccg 30240 
aagccgtgct ggcggccacg cgttttgccc gaacggtcac cgggcgcgac aagatcgccg 30300 
tcttcgaagg gtcgtatcac ggtctccttg acgaagtgct cagccggccg ctcgtggtga 30360 
acggcgaatt gcggtcggcc cccgcggcgc ccggcatcgc cggcagcgcg gtgagcgaag 30420 
tcatcctgct cgagtacgcg aacccgaagt cgctcgaggt catccgcgcg cacgagtcgg 30480 
agatcgccgc gatcctcgtc gagcccgtgc agagccggcg gctcgacctg cagccgaagg 30540 
agttcctgca cgaggttcgc cgcgtggcgg acgagatcgg cgccgcgctg atcttcgacg 30600 
aagtcatcac cggcttccgc gtgcatcccg gcggcgccca ggcgcacttc ggcgtccgcg 30660 
cggatcttgc cgcctacggc aaggtcgtcg gcggcggcat gcccatcggc atcgtcgcgg 30720 
gcaatgcgaa gttcatggac gcgttcgacg gcggccactg ggagtacggc gacgggtcgt 30780 
tcccggaggt gggggtcacg tactttgccg gcacgttcat gcggcatccg ctcgcgctcg 30840 
ccgctgcgaa ggccgtgctg acgtacatga agtcgcaggg tcccggcctg cagacgcggc 30900 
tggcagatcg gatcgagcgg ctcgcgaacg acgcgcgcgc gatcgtggcc aggcacggcg 30960 
cgccgtatca gattacgcag ttcagctcga tgatgagcct gaacttcccg cacgaccaga 31020 
agacggcggt gctgctgttc ttcctgatgc gcgagcgcgg gattcatatc tgggagggcc 31080 
gtcccttctt cttcacgacg gctcacaccg aggccgacta cgccgccatc ctccgcgcgc 31140 
tcgacgagtc gctggccgag atgcaggcgg ctggcttctt cggcgcgccc gcccactccg 31200 
cattgcaggt gaccgggacg gcggaccagg acgtcctgcc gttcaccgaa ggacagcagg 31260 
agatctggct cggcggccag atgggcgagg cggcatcacg cgcctacaac gaagtcgtcg 31320 
cgctcgacct gcgcgggcct ctgaatcgcg cggcgatgca gcgcgccgtc gacgaggtcg 31380 
tcgcgcgtca cgaatcgctg cgcatgacgg tggccagcga accgctcggc ctgcgtcacg 31440 
ttccgggcac gtcggtgccg gttgaatggc aggacgtgtc gggattgccg gaggacgctc 31500 
agcacgaggc cgtgcagaag gtgctcgagg gcggcgacgg cgtggcgttt gacgtggaac 31560 
gcgggccgtt gctgcgcgtg accaccatga cgctcacccc cgagcatcac gtcttcgtca 31620 
tggcggccca ccatctcgtg tgtgacggct ggtcgttcgg cgtcatcctg cgcgatctgg 31680 
cgacgttcta ttcggcccag gttcgaggca cgcgtgcgaa cctcgaggcg ccgatgcagg 31740 
tgagccgctt cgcgcgtgag gatcgcgagg ccaagcagag tcctgaagcg gccgagaccg 31800 
aggcgttctg gatccggcaa ttcgacaccg tccccgagcc gctcgagtta cccgccgatc 31860 
ggccgcggcc gccagtgaag tcgtaccagg gcgcccgtgt gtccgtgccg tttgacgcgg 31920 
cgctcgctcg cggcgccaag aagctcgccg ccgaacaccg cacgaccctg tttacgacgc 31980 
tgctcggcgt cttcgagacg ctcgtctacc gactgaccgg ccagcaggat ttcgtcgtcg 32040 
gcatcccggt cgcggtgcag cctctgctcg gcgaggatct ggtcgcgcac tgcgtcaact 32100 
tcgtgccgct tcgcgcgcgc gtgtcaccca ccgcgacgtt ctcggagtac ctgaacacgc 32160 
tgaggacgca gtcactcgac ctcaatgagc accagaactt cacgtacggc agtttgctcg 32220 
ccaagctgcg gttgccgaag cacccgggcc gcgatccgct cgtctcgacg agcttcacgc 32280 
tcgagcccgc catcgtcgat cccgggttcg acggtctcac ggcgcgttcg ctgacggttc 32340 
cgcgcgtgac atcgaagcgc gacctgcacg tgaacgtcat ggagatggac ggcggtctgc 32400 
tggtcgaggc cgactacagc accgatctgt tcgacgaacc gacggtgcgc cggtggctgg 32460
atagctatcg gatcctgctc gaggccgtcg tgtcgtcccc cggccggtcc ctcctgagcg 32520 
tgccggtgat ctcggagccg gaccggagtc agcttgtcac cggctggaac gacacggccg 32580 
ccgactaccc gcgcgatcgc gtcatccatc aactcgttga agagcgcgcc gtgcagaccc 32640 
cggagggact cgcgctcgtc tgtggtgccg agcggctgac ctggaccgag ctcaaccgac 32700 
gcgcaaaccg gttggcgcat gcgctcgtga agaagggcgt cgcccccggg agcagggtgg 32760 
cgctgtgcct ggaacgatcg gccgggttca tcgtctcggt gcttgccgtg cacaaggccg 32820 
gcgccgcata tgtaccactc gaccccgtct cgcccggcga tcgcaagtcc ttcgtcgtcg 32880 
aggattccgg tgccgtgctg gtcgtgaccg attcgcggtc cgaaggccga tggctgacgc 32940 
ccaggacgcc cgtgctccac ctcgacgccg acagtccgcg aatcgcgcaa gagtcgcacg 33000 
atgatctgaa ggtcgccctg tcggccgaag acgccgcgta cgtgatctac acgtccggat 33060
cgacgggLa accgaagggc gtcgtcatcg ggcaccgcca gctcgtgaat tacacgtggg 33120 
cggtgatcga acggatcggg ttacccgccg gctcgagtta tgcgctcgtg tcgaacgtcg 33180 
cggccgatag cggtgtgacc gggctctcgg cgctcgcgat gggctgggtg ctgcacgtga 33240 
tcacctcaga ggtcgcgacc gacgggcggg cgctcgggtc gtatttgacc ^tcacgaca 33300 
tcgacttcct gaagatcacg ccgtcgctgc tcgcgtcgct gctcggcgat cacccgtcgg 33360 
cctcgctgct gccgcgccgc tgtctcatgt tgggcggcga gccgtcacgt ccggcgtggg 33420 
tggalgagct gcagcgtctc gcaccaggct gccggatcat gaaccactac ggcccgacgg 33480 
agaccacggt cggcgtgttg acgtactgga ccggtgaccp cagggatctg ttcgtcgaga 33540 
glgtgccggc cggcaggccg ctcgggaacg tccgcgccta cgtgctcgat -ggccggag 33600 
!gcccgcgcc gattggcgtc gtcggcgagc tctgtatcgg cggcgcgtgc gtggcacgcg 33660 
gg tat!tcaa tcgtcccgac ctcaccgccg agcgcttcgt ccccgatccg ttcgccaccg 33720 
Igccgggagc gcgcatgtac cggaccggcg accgcgcccg gttccgggcc gacgggaaca 33780 
tcgagttcct cggtcgcgcg gactatcagg taaaagtacg cggattccgc ^cgagctcg 33840 
gcgaaatcga agcggcgctt cgcgctcatg agggcgtcga gcaggccgta ^cgtgctgc 33900 
gaaaagatca gcccggcgac gagcgcctga tcggctatgt cacgaccggc ggcggcgcct 33960 
cgatcLgat ggcggagctt cgcacgtcgc tcaagcagag gttgccggac tacatggtgc 34020
cggcggtgat cgtcgcgctc gacaagctgc cggtcacgtc gattggoaag atcgatcgac 34080 
gcgccctgcc ggcgccgccc gagcgcgcca gtttcgagtc ggagttcgtc acgcccgcca 34140 
ccgagacgga gcgccggctt gcggagatct ggagcgcggt gctcggtctc gagaagattg 34200 
gcgcgcaia caacttcttc gacctgggcg ggcactcgct gctcgcggcc cgcaccgcca 34260 
tglggtccg cgacacgttt cgcgtggacg tgtcgctcat cacgttcttc gaaatgccca 34320 
ccat!gccat gatggcggca gcaatcgacg ccggtaccgc ggcccctgcc tcccggacca 34380 
tcacgccgag ggcccgcgtc agggcccgca gggatgagct ggtccgccca tgaaccccgg 34440 
ccacccgcag actgagtccc ccgacctgcg gcgcgacggg atcgccgccg aagcgggcga 34500
cgtgtttctg gcgccggcgt cgtacgccca ggggcgcttg tggttcctgg ^.gctog. 34560 
cLtgaaagc tccgcctaca acattccgat cgcgctcaga ctgaagggcc gcctcgaccc 34620 
cgccgcgctc gaacgcgcca tcaacaccat cgtccagcgt cacgagacgc tgcggacgac 34680
g!tctcgctg cagggcgatg agctcaagca ggtcatctct ccgacgctgc gtattccgat 34740 
ftccctcgtg gacatgcgtc tgctcagtcc tgagaaacgc gaggccggat WW* 34800 
tgcggccgac gacggcgcgc gcgtcttcga tctggctgag ggcccgctgc tcaaggtcgt 34860 
actcgtcltc ctcgagccct ccgatttcgt cctcctgatc accgtgcatc acatcgtctt 34920 
cgacgggtgg tccggcggga tcttcatccg cgagctggcc gaggcgtacg ^ctgagct 34980
cgacgggcgc gacccgaagc tgccggagct gccgatccag tacgcggact ttgccgcgtt 35040 
ccagaaggla cagaccgaag ggcccggcct gccggtgctg ttgaagcact ggacgcagga 35100 
gctggcgggg gcgtcgatga cgptcgacct gccgacggat cgtccgcgtc cgcccctgca 35160 
gacltc!Igg ggcgccctcg cctcgcttcg gctcgacgga gcgatctggc cgggcgtcgc 35220 
cacattcagc !ggggcgaga acgcgacggt gttcatgacg cttctctccg ccttcggcgc 35280 
cgtgctgcat cgctggtccg gacaggaaga catcctcatc ggcagcccgt cggccggccg 35340 
cgatcgLcg gagcttgagg gcctgatcgg gttcttcatc aacacgttcg tcctgcggct 35400 
caacgfgtcg gglgatccca cgtttcgcga gttcgtcggc cgggcgcgcc gcgcgtgccg 35460
tgcggcctac gcgcatcagg acctgccgtt cgatcggctc gtcgagacgc tcaacccgga 35520 
acggacccgg gatcggcatc cgatctatca agtcatgttc gcgcaggatc cgccggcgtc 35580 
gaagatgacc ttcgccggga tcgagctcga gcggcttgtc gtccacaacc gcagcaccaa 35640
gtgcgacctc acgctcgagg ttgccgaaga cgccgacggc gtcaccctgt -gtcgagta 35700 
cagcgtcgac ctgttcgatg cggcgacggt ggcgtcgctg ctgcagcaat -tcgoogtgt 35760 
g cttlagacc gcgatcgcgg acccggacca gcggctctcc gagctgcgtc tgcttggcga 35820 
fgacgagcgc atgcagctga tcgcactggg gaccggtccg gcggccgtct acccgcgcga 35880 
cgtcagtctg gccacggtct tcgaagagcg ggtccgcgcg actccggcgg cagtggccgc 35940
gacgctggag ggacagcacc tcacgtacgc ggatctcaac cggcaggcga accgcctggc 36000 
ccggcggttg aaggcgttcg gtgtgggccc gaacgtgctg gtcggcgtgg cgctcgaccg 36060 
gtcgttcgat ctgctggtgg cgctcctcgc cgtggcgaag gccggcggcg cgtatctccc 36120 
gctcgatttc gctcacccgc aggaacggct tgcgttcatg ctcgccgaca cgcgcgcgcc 36180 
catcgtgctc acgctgcggc gatttgccgg ggcgctcgag tcctttcccc tcttctgcct 36240 
ggacgacgag ctgccggcca ccgccgagga gccggacgaa gatctgccgc cgcagtcgac 36300 
cggagaccac ctggcctatg tgatgtacac ctcgggctcg actggtcagc ccaagggcgt 36360 
cgcgatcacg catcgcggcg tggttcgtct cgtacgcggc acagactacg tccgctggaa 36420 
cgacgtgcgg gcgatcctcc acatcgcgcc gacgtcgttc gacgcgttga cgttcgaggt 36480 
gtggggcgcg ttgctgaacg gcgcgcggct ggtgctggtt cccgagcggc tcgtcagcct 36540 
ggagacgctc gagcgcacgc tccggtcgga gcaggtggac tgcctgctgc tcacgaccgc 36600 
gctcttcaac gccgtcatcg acgcgaagcc cggcattctc gtcggcgtga agcaactcct 36660 
gatcggcggc gaagcgctgt cagtcgcgca cgtgcggcgc gccctggcgc acctgccgga 36720 
cacgcggctc gtgaacgcgt acggtccgac cgagtgcacg accatcgcgt gcgcctacca 36780 
gattccgcgc acgctcgatc cgcaggcgcg gtccatttcg atcggccggc cgattgccaa 36840 
tacgcaggcg ctcattctcg atcgacacgg cgacctcgtg ccgttcggca tcgccggcga 36900 
gctctgcctc ggcggagacg ggctggcgcg ggaatacctg aatcagccgg cgctcactgc 36960 
ggaacggttc gtgcccaacg cgttcagccc cgagccggcc gcccggctgt accgcacggg 37020
agacctcgcg cggatgcgcc gcgacggcaa catcgagttc cttggccgga tggatcgcca 37080 
gatcaaactg cgcggcttcc ggatcgagcc gggcgagatc gagcatgccc ttcgtgcccc 37140 
gggagacgtt ctggacgcgg tggtggagat ccgggtcgat gcccacatgg cccatctcgt 37200 
cgcctacgtc gtgcgcgccg atgggagtca actcaccgga accgacctgc gcgaacgtct 37260 
gaaggcgcgc ctgccggaat actgtgtgcc cgcgaagtac gtgttcctgg ccgaggtccc 37320 
gcgaacggcg gccggcaagg tcgatcgctc ggcgctgcct gcatcggcga tcgaggcgcc 37380 
cgcgcccgag ccgcgcgatc cgtcgttcct gaacgaggtc gaagcggagc tcgcgcgaat 37440
ctggagcgaa gtgcttcggc tgccctccgt gctcgcgacc gacaattttt tcgatcgtgg 37500 



<210> 2
<211> 37507
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: DNA sequence of clone FS3-135.

<400> 2
gatcccacga tcggcgccag cggattgcgc agctcgtgcg cgagcatcgc gaggaactcg 60
gtcacgtggc ggccctcggc ctccagctcc tccatgcgct tgcggtgcga gaggtcgcgc 120
gtcaccttgg cgaagcccat gtgccttcca ctcgcgtcgc gaagcgccgt cacgacgacg 180
tcggcccaga agcgcgtgcc gtccttgcgc acgcgccagc ccgtgtcctc ggcgcggcgg 240
tgcagcagcg catcgtgcag ctcggcctcc gggcggccgg ccgcggcatc ctcgggcgga 300
tagaagagcg aaaagtgccg tccgatgatc tcttcggcgc tgtagccggt gatgcgctcg 360
gcgcccttgt tccagctggt cacgtggccc gcggggtcca gcatgtagat cgcgtagtcc 420
tgcacgccct cgatcaggag gcgcgagcgc tcctcgcttt gccgcagctg ctcctcgtgc 480
tcgcggcgct cggtcaggtc gcgcgtgacc ttgccgaagc cgaccagcgt gccgtcctcg 540
tcgcgcgcga tgtcctcctc gg y y ^^tccacg aacgcacgta gccctcggga 780
gc gcgatagc ccttcaactg tt«£ago~ ^ gatggcgtcc 

tcgagcacga agatcgcgta gcqaggt tgc gcaccgaggg ccgcggaccg 900 

tcgccggcca gcccgctcgg ggcgccgaag Lgttgcaat cggacaatca 960 

gat gggcttg gcatggctat ^ ^ t acgcagggag ctggg gacgc 1020 
gagataattg cttgatgaag »tgocatttt *M ctctccgcag ggC gccagcg 1080 

tcgaggccca gggcctgtac aagtccgagc gcgtgc g ggc ctggcca 1140 

tg caggtggc cgcccacgac gtgctcaact tctgcgcca. tacggca t g g 1200 

accacccggc cctggtcgcg gccgcgaagg ^ tcg g J gctggag acgcg cctcg 1260 
cgtcggttcg cttcatctgc c t g cttcgac g ccaacggcg 1320 

cgcgcttcct cggcaccgag ^cgcgatcc ^tacgg ct gaaccacg 1380 

ggc tcttcga gacgctgctg ggcgaggagg cttccgctac gacaacaacg 1440 

Ltcgatcat cgacggcgtg cgcctctcga ggccggcgcg cgctacaaga 1500 

acatggcgag cctcgaggcc cagctgaagg ! g cgalcctg ggcgccatct 1560 

tgatcgccac cgacggcgtg ttctcgatgg acggca g ^ gC cgtgggct 1620 

gC gagctggc cgggcgc.ac cacgcgatgg * aagg tggaca 1680 

tcatgggcgc gcacgggcgg ggaacgcccg agca tg gg g acggcgggC a 17 40 

tcctcaccgg cacgctcggc aaggccctgg ctatctcttc tccaacacgc 1800 

agcgcgaggt ggtggcctgg cttcgcaac «*c**~ ggcggC ggcg 1860

tgatgcccgc catcgcgggc gcgtcgctca aggtgc g acgC gcctcg 1920 

agc tgcgcgc gaagctcgcg cgcaacgccc go^-J /^tgggc gaggccccgc 1980 
gcttcacgct ggccggcgcc gaccatccga tcatcc gg ^ ggc ttctcgt 2040 

tcgcgaagga gatggcggac cggctgctga ^fcggcc gcccatgaac 2100 

ttcccgtggt gccccgaggg caggcgcgca tccgcacgca gatg g ttgggagtca 2160 
cgaagcacgt cgatcgcgcc atcgccgcct ttgccaaagt J tcgagtcgcc 2220 

ttgcttgaga accctttcga agacgaagcg aagaagaccg ccatctgcgg 2280 

caagcccgtg gtcggccaca acgacgtgct gatccgcg g g cggtgccgat 2340 

caccgacatg cacatcttca actgggacga ^ ^ tgcgcggcct 2400 

gacggtcggc cacgagtacg taggcatggt ^aggccatg gg ^ actg 2460 

gcaggtcggc cagcgcgtct ccggcgaggg g t gg gcgtga accgccccgg 2520 

ccgcgccggg cgccgccacc tgtgccgcaa J^tto ccgacgacat 2580 

cgcgttcgcg gactacctgg tgattcccgc gcggcg caca ccgcgctctc 2640 

ccccgacgag atcgcctcga tcctcgatcc ggg ccgatcg gcatcatggc 2700 

gttcgacctg gtgggcgagg -gtgctgat «P«W«J« *™ J tgaacgacta 2760 
cgcggccatc gcgcgccacg toggogegeg gtgaaC gtat cgaaggaaaa 2820 

ccgcctggcg ctcgccacga agatgggtgc g fc 4 acgtcg gca tggagat 2880 

cctgaaggac gtgatgcgcg agctcggcat atgaaccaC g gcggcaagat 2940 

gtcgggcgtg ccctcggcct tccgccagat tggacC cagg tgatcttcaa 3000 

cgccatgctc ggcatcccgc ccagcgaggc cgcgatcgac tgg 9 acaagatgat 3060
gggcctggtc atcaagggcg tctacgggcg ^/gct tcgacgtgcg 3120 

cgccatgctg cagagtggcc tcg-ootatc - ^cgggaaag tcgtcctgtc 3180 

cgagtacctg aagggcttcg agacgatggg cggC ggcgat cgtgcaggcg 3240 

gtgggattag cgagggccac „.££C oojgtte^ « 

ggttggcagc acgccccggg at ^~ g ^ gctgctcg at gcccaggg cacctcgctg 3360 

sss zzz ss — — 3420 
„v „o i-hcccaqccg cggccgcctc ggtcgagttc 3480
cgcgcctccc tgcgcctgga ^cgttccgc tt^ c g caCcgcagc gggc gtggtg 3540 
gcgctctcga tgcgggtgca aggtgagcgc <**£™ acctC gaggc cctggatggc 3600 
Lcgcgggcg caccccacgg c£W£ acagctggtt ctccca gcat 3660 

acgccggcgc tcgagacgct M cct ^ Lcctgggcg ccggatggcc cgggtggcgc 3720 
ttcgaocatg cgtcgcgcac ^gcgtcgag tacctg* g cgcgc tgcgc 3780 

gg gcgcgtgc tggatgcggg -tgoggogac g g ctaccggg t g cttcccgac 3840 

tatgcgcccg agcaggtggt gggcatcgat acaacct gca gttcctcccc 3900 

atcctcgagc gccacgccct cggccacctc gcgctgc gg ctcgtgg tcc 3960 

gcgga tgcca acgcgctccc W ccgaggcgcg gcgggtgctc 4020

gcggtcgagc atttcgtggg <^ a «*« aaqctct act acaccgcgca ccacggccat 4080 
aagcccggcg gcctgctggt catccatccc ^ ctct tgaagtcgcc cgacgaggtg 4140 
cacctcggcg agtacagccg cgagcccttc ttccac W ggcggcat cgcgcc gacg 4200 
cgcgacatcg tgttcggcgc ggacgtgagc ^ccacg gcag ttc g a g 4260 

cgcgcggagc actggcgctg gtacaacgag cact gcgcgc cgagccgctg 4320 

g acgagttgc gcgcgctcga cttcgagccg agctg ggcgt ctcc g a g ctc 4380 

gtg gagtaca ag cc g gagct ^tgcgctac c ^ aM ggac gcatgggcag 4440 

tacgtggcct gcatcaaccg caagcccgtg tcagatgg tt gttatccc 4500 

gcggcattcc ggtaaggtcc gccgggacgg ccccgatgaa gacccagacc 4560 

Igtcactccc gg act g ccgg ^tcaccctc -agg^aa ccgcg cc g aa 4620 

caggccgtgc tcgccgcggc cctgctgtcc *tg^£ cctcgagcgc ga a g ggcagc 4680 
gccgtcacgc tgacgggcgc ^cgaagtc ccgco^c ccgt caccggcat g 4740 

gg cacggtgg tcgtgaaggc cgattgcacg c Lacgcggg cgtggcagtc 4800 

Iccgccaccg cggcccacat coacg. cgggcgccaa gatgaccgag 4860 

ccgttcgtga agaccgcgga cacacggttc gagg g cgaggcccac 4920 

gcgcagtgcg ccgcgtaoaa ggcgggcggc acctac* g J caaaaaga aa 4980 

Lgggcggcg aggtccgcgc = ™££ g ggggcgaaag gatt cgaacc 5040
aagcccgcgc aa gcgggctt tttcgttg ^ cgc tac g ccc cgaccgaact 5100 

ttcgaccccc tgcaccccat gcaggtgcgc taccagg^tg * gggct tc g gg 5160 

tgaLttata cctgcatcgg ggtcgcagcc Lggctacat cggcacccac 5220 

tagcatccgc ccatggcgac gaccctcgtg g/atcgacaa ctactccaac 5280 

atcctctgcg cactggccca ggccgggcga Lggttgcgt cgaggccttc 5340 

agctcgccgc gttccatcga acgcgtgctg tcgcag ggcg cgacgtggac 5400 

g g acg t g gaca tccgcgacgc o^o^o cgcaa,g ^ ^ gcccgagcgc 
agcgtgatcc acctcgcggg actgaaggcg g ™ cgctc gc cgactcgccg 5520 

tLcacgaca acaacgtccg cggcaccgag agcc^ gg^g 

gtgcgcaagt tcgtcttcag ct^« * gccagaacaa gctcgacat c 5640

atcgacgagg acgcgcccac ctcgccgcag ag * ggcgcg tagt gaacctgcgc 5700 
gag acatgc tggtcgccct cgccaagcgc g c ga ggaccc ggccggggtt 5760

Lcttcaacc cggtcggcgc acacgagagc gg ogactgaa ggagctgagc 5820 

cccaa caacc tcatgcccta cgtgtgccag f gcgcga ttt catccac.tc 5880 

gtctacggca gcgactaccc caccccggac W™^ » ggc cccc g gg 5940 

t g cgacctgg ccgaaggcca cgtggccgcg ccgtg ctgca attgctcgaa 6000 

accgtcctca cggtgaatct cgggacgggg ctc f Ct tcgtggggC g gcggccgggc 6060 
aC cttcgagc gcgtgaacaa actaagcatt J ccgt gctggg ttggaccgcg 6120 

ga cgtcgcga tctgctacgc gaacgcgggc ggcagga aag gaatgcaaag 6180 

aagcgcggaa tcgaggagat ^gccgcgat 9 cttgctg caa tgc.gcatga 6240 

sss caggacgtca " c 
ccggcgttgg aaggaagaag g c a acc - S=
ccgagtaaga atcgtcagtc ^gaccaact cgcct ggg * ^ cggcgctacg 6480
agC aacacc t cgcatcgcgt ^ fccgtcaccac cgtC gatccc 6540 

cgogaatccc tctggccctc gctcct eg g ggg ccgagcccgg caccagctac 6600 

gcgcgctggg egaaggagee S tacta ~^ -.LLttog aegegtegtt cttcggcatc 6660 
tcgttcgccg ccggcaccat cggcgacatc tg 9 ctgctcga gc tc t cct g g 6720 

tccccgcgcg aggccgcgca gatggatccg -g^cg * ttgcggcgtg 6780 

gaagccctcg aggccgcggc ^gaagece q J t l cgacc t ggcgtccatg 6840 

ttcgtgggcc tgtccaccgt cgactacagc W ccaaccgcc t ctegtattte 6900 
gaegegtegt cggccaccgg «««ccgcg ^f^l gctcgtcctc gctggtggcg 6960 
tacgacctcc acggccccag catggtcgtg gataegg g « ggtg ggcggc 7020 

ttccaccagg cctgccagtc ^tcgccatc ggegagaege ^ ^ gatgctctcg 7080
atcagcctgc acctgcatcc gtacgggttc **** gctacgtg c g cteggaagge 7140 

aagegeggog cctgccgcgt ^tcgacgcc a£WO£* g ccgacggcaa cccgatcctc 7200 
gcgggcgtgc tggtgctgaa „£tog~ ««« caccgtgccc 7260

gccgtcgtcg ccgcgagcgg cgtgaatacc gcggg * gg cat cgacccg 7320 

agegtcgatg cccaggctgc attgetgteg ^f f ^ tgg t g ggcga tccgatcgag 7380 
tLgagatcg actacctcga ggcccacggc gccacclgcc gctgccgatc 7440 

acccgcgccc tcggcatcgc gctcgggcgc " gggca tggc gggcctggtg 7500 

ggt tc g gtga agagcaacgt cggccacctc ccatccacct cgattcgccc 7560 

aaggcgctgc attgectcaa gcaccgcgcg ^cg gc tcacgctc 7620 

aacccoaaca tccacttcga cg.fltgg«c ctgaagc gg cgccaacgcg 7680 

catccgtaca agaagctcgt cgtcggcgtg «*bog« g g ggccggcgag 7740 

cacgtgatcc tcgagagcac ^ccacttcg <*»»**™ gcgacgtg gc ggcctcgtac 7800 
. gtggcgctca tgetgagegg ccacgacgac aC atcgcgta caccaccgtg 7860 

gcgaacttcc tgeaggegag cgagcaaccc ^tgcacg cgcca gcatc 7920 

cactgccgcg actggcacca gctccgcgcc ^ggcgtcgt gageggcegt 7980 

gC cggcgccc tegegcaatt cgccaagggc ^Lggcgc gcaatggccg 8040 

gcgctgccca acgcgtccgg ccccgcattc ^gtattegg * gcaat cggtc 8100 

gcaatggggc g cgccctgct cgccggcgag agatg^tgga gcaggaggtc 8160 

gaegegctgt tccgcccgct cggcggctac tgctctt cgc cgtgcaggtg 8220 

cccgccgacc agctggaccg caccgagatc ^ZTc cgcgcgtcac cggccacagc 8280 
ggcttgaccg agetgetgeg ccactggggc *JJ tgcggcgcgc 8340

gtgggcgagg tcgccgccgc ttgggcgagc gcgcg ggcgc gatgaccgcc 8400 

g tgatctacc accgcagccg ctaccagggc 9 j£*tt gaagggegag 8460 

gtcggcctcg ccgaggacgc cgcgcgccgc tcgccggcac gcgcgccggc 8520 

gtgaaegteg cgtgcatcaa cagcacgcgc aacgtgacgc J cctcgacc tc 8580 
ategaggege tegaggcega gctcacgcag cgcaag* gt gagcgC gctg 8640 

gaetacgegt tccacagcoc ggcgatggac -g^cgeg J™ cggctcgccc 8700

cgcggcctca cgccgcgcgc cacccgggtc alltccgcga gccggtgcgc 8760 

g ccgcgggca accagctcga cgccacctac atgtc ttcat egagategge 8820 

ttccaggccg cgatccgcgg catcgccgag ^gtca ^ gatcga gggc 8880 

ccgcacccga tcctcaagaa ctacatcaac * ggcga tccg cgccgccgcc 8940 

cgcgcgctgg tcacgctgca ^getoggeg ^ctccc cgcgcagggc 9000 

caggaagtcg tgatcacegg ^cccggtg g l cactgqcg cgcg cccacc 9060 

cacttcgtcg agctgccgcc ctacccgtgg ^ ct atcgcctg 9120 

tcgcaggcct acgacctcat ccagcacggc ^cagc ^ ggcgtacgcg 9180 

cacgagaaog acttccagtg ggaaaaccac atcgacac 
gaccatgtcg tcggcggcgc cgtggtgttc cccgcggccg gcttcgtcga gatggcgctc 9240
gccgcctcgg cgatcgcgct gggcggcgac gcgcacgaga tcgagacgct cgagatccgc 9300 
agcgcgctgc tgctcgagga ctccacgtcg aagaccgtgc gcttcgcgct ggaacccgat 9360 
ggccgcttca caatccgcag ccgcgctcgc ctgagcgagg atccgtggca actccacgtg 9420 
gtcggcaagc tggtcggcac gcccaccgag cttccgcgcg cgccgcgcca gtcggcgccg 9480 
tcgcgcaagc ccgatgtgat cgcggcccag cactacgaag gcgccgcgaa ggccggcctc 9540 
gcctacggcc cggccttcca gtcggtctcc aaggtgtggc tcgacgccgc ggacccgggc 9600 
agcgcctacg cgcgcctgat cctccccaag ccgattcgcg gcgagctggg cgtgatgcac 9660 
ctgcaccccg cgtcgctgga cggctgcttc cagctgctgg tcgacctgct gcgcgccgag 9720
gcctcccggc acgcgcagat cgccttcgtg ccgatcctgg tcgggcgcac gcgcctctac 9780 
ggccctgccg ccctcgtcac cgcctgccgc gtccgcctga ccgcgcgcag cccgcgctcg 9840 
ctggtcgcgg acttcgagct ctacggcccc aacggcgacg tggtggccgc gctcaccggc 9900 
gtgcgcttcc gcggcgtcca gctgaagcag ccgcagggcg cgcgcctgcg ctacctgtcg 9960 
taccgcgcca tcccgcgcct gaatggcggc gacacgccgg gcgccaagcc gctgcccctc 10020 
gcgttgctgg ccgcggcgtg cggcgagcgc tggcattcgc tctcggccgg ctcgcgcaag 10080 
cgctactacg atgaggtcga gccgctctcc gacgtgctgt gcagctcctt cgccgagcac 10140 
gcgctgcgaa agcggatcgg cgaggccccg ctcatcgacc cggccgcgct cgccgcgtcg 10200 
acgccggccg agcattggcc gctcctcaag cgcgcgatcc agatgctggt cgaggaccag 10260 
ttgctcgagc ccgccgaagg cggctggcgc tggagcccgg tcaccgaatt gcccgacgcg 10320 
caggagagct ggatcacgct gctgcgcgac tacccggacg aggccggcca gctgatcgcg 10380 
ctcgggcgcg ccggcgctca cctcgcggaa gtgttcgccg gcaaggaagc gtccttgccg 10440 
agcctgttgc cggccacgca ggcgaacgtg ttcgcgcgcc tgtgctcggg cgcacccgcg 10500 
ttcgccaacg ccgggcgcgc catcgccgac acgctggcgc tcgccctcga gcgcctgccc 10560 
gcgcaccggc cgctgcgcgt gctcgaggtg acgccgtgcc gctccgagat cacgctgcgc 10620 
gtcctcgagg ccgtggacct cgaccgctgc gagtaccacg tcggctgcac cacggacgaa 10680 
gcactgggcg agtacgaagg cgtcctcgat gccattccgc acgtcgagac ttgcgtcgtg 10740
gacctgaaga tcccgggcat gggcctgaag gcgccgaacc agggcccgtt cgatgtcgtg 10800 
atcgtttccg atggcctgct cggtgcggcc gatccggatg cggccgcggc gcacctcgcc 10860 
ggcgcgctcg cggaagacgg gctgctggtc atggtcgcgc agcatccgtc gcgctggacc 10920 
gacctcctct tcggcgcgga cccggcgtgg tggacgctcg ggccctcggg ctcgcagcgc 10980 
tcgcgcctgc gttcgccgga cgagtggcgc atcgcgctgg cccacctcgg cttcgggccc 11040 
accgtcgcgg tgcccgagct gcccggcctc gagcgcggct cgtacgtcct catggcgcgg 11100 
cgcgacggcc cggtgccgca ggctgccgag cgcagcgttc ccgccaagtc gggctggatc 11160 
ttcgtgcagg aagcgggcgg ttattcggcg gcgttcggcg cctgcgtgca ggcggatctg 11220 
cgcgcacgcg gccacgcggt gttcgagatg gcgcccgatt ccgtcgaggg ggccgagatg 11280 
gcgcagcttg cgcgcgcgcg ccagacgctg ggcgacatca gcggcatcgc cttcctggcc 11340 
ggcctgcccg acgacgcggc cgccacgggc gacgcggccg agcgcctcgc gcgccagacc 11400 
gcgcgctgcg ccgcgctcgc gcgcctgctg cgcgaatgca aggcgagtgc catgacaccc 11460 
gcggtgtgga tcgtgacctc gggcgcggcc tccagcctgg cgcgcaacgt ccccgccctt 11520 
gcctcgcgcc gccgccctcc cgatccggac gaggcgatcg tatgggggtt cgcgcgcacc 11580 
gcgatgaacg aattccccga gctgcgcctg cgcctcgtgg actgcccgaa cccggagaac 11640 
ctcgagcgca acgccgcctc gctggtgcag gagctgctct cgcccaccgc cgacgacgag 11700 
gtggtcctca ccggcgccgg ccgctacgcg ctgcgcctgg gcgcggtgcc gccgcccgca 11760 
tccgtgcgca ccgctaccca gggcatcacg cgcctcgact tcgccgcgcc cggcccgctg 11820 
aagaacctcg cgtggcgcgg cgacgtgcgc cgcaagccgc gcgcgaacga ggtggaaatc 11880 
gaggtgcgcg ccgcgggcct caacttccgc gacgtgatgt acgcgatggg cctgctctcc 11940 
gacgaggcgg tcgagggcgg cttcgccggc gcctcgctcg gcatggagct ctcgggcgtc 12000 
gtcaccgccg tcggcaagga cgtgacctcg gtcgcgcgcg gcgacgaggt gctcgcgttc 12060 
gcgccctcgg cgttcgccac gcacgtggtc accaccgccg attcggtcgc gaagaagccc 12120 
gcSgctgga cgttcgagtc cgcggccacc atcccgaccg cgttcttcac cgtgtactac 12180 
tcgttgalgc acctggccca gctgcgcgag ggcgagcgcg tcctgatcca cggcgccgcg 12240 
ggfggcgtgg gcatcgccgc gatccaggtc gccaagtggc tgggcgcgga gatcttcgcc 12300 
Iccglggct ccgacgagaa gcgcgacttc gtgcgcctgc tgggcgcgga ccacgtgctc 12360 
gactcgcgca cgctcacgtt cgccgacgac gtgctgcgca tcaccaacgg cgagggcgtc 12420 
oacgtggtgc tgaactcgct ctccggcgag gccattgcgc gcaacctgcg cgcgctgcgc 12480 
ccgttcggcc gcttcatcga gctcggcaag cgcgactact acgagaacac gcacatcggc 12540 
ctgcgcccgt tccgcaacaa catctcgtac ttcggcgtcg atgccgacca gctcatgaag 12600 
glgcggcccg acctcgcgcg ccagctcttc aacgagctga tgcagctctt cgagcagggc 12660 
gtgctctcgc cgctgcccta ccgcgccttc gccgccaccg aggcggtgga agccttccgc 12720 
tacftgcagc actcgcgcca gatcggcaag gtggtgctgt cgttcgccga cggcgtgaag 12780 
acgcagcccg cccggcccgc cgtgccgcgc gagctggcgc tgtcgccgaa cgccacgtac 12840 
ItcgtLccg gcggcctgtc gggcttcggc ctgcgcaccg cgcagtggct ggtggacaag 12900 
ggtgcccgc! I c 9 ct gg t gct ggtctccaag agcggcgccg agtcgctcga ^ 
gcggtggccg acctcgaggc ccagggcgcc acggtgatcg cggaggcgtg cgacatcacc 13020 
gaccgcgcct ccgtgcaacg gctgctcgcc gaggtcgcgg ccgcgctgcc gccgctgcgc 13080 
ggcgtgatcc acgcggcggc cgtgatccag gacggcttca tcaccaacat gacggccgcg 13140 
Sgatccgcg acgtgctggc gcccaaggtc ctcggcgccc gccacctgga cgagctcacg 13200 
cacggcgccl agctcgattt cttcgtgctg tactcgtcgg cgacgacgct gttcggcaac 13260 
ccgggccagg ccaactacat cgcggccaac tgcttcctcg aggcactcgc caggcgcgc 13320 

£— = r.= = == 
ss= = s= = :: = 5 ss 

:= ss= = = : = 

!cgctgcgcc gcg.ggtggg cg.gatcct, cgc.tcccgc cgg.ccgcct cg.t.cgcgc 13740 
cagcccttgc aggaaatggg catggattcg ctgatgggc, tggagccgct gaccgcggtg 13600 
Taggcgcgct tcggcgtgaa cctgccggtg atgtcgctct ccg.gc.gcc gtcgatcg.g 3860 
aagctggtcg accgctcgt gcgcgcgotg ..gg.cccg. .cgccggcac cg.ggccga. 13920 
"gcgcagcg accagatcg. gcgcgtcgcc gcgc.gt.cg cgcccgagct cgactcgcgc 
caggtcglgg .gctccgc ggcggtcg.t g.cgcgc.g cc.cgc.gc. ggg.cgtccg 1 040 
tgaaggcgc, caacctccgc ggcctctcgc aggcagtga. gg.gc.gctg «cc.gcgcg 14100 
tgctcgagca gcgcgtgcgc cgggtcg.g. agg.c.gcc ctcggcggtc g.gtccgcc. 1 160 
«,acgcctt ccgccccg.c cccg.tgccc tgggccct ctcggagcgc tactgccgct 1 220 
tcgacatgc cccggtctac cagcagatgc .gctggtg.. ggaaggtgcc gcg.agctgg 14280 

— ;r.rr.rc =s sss sss 

c cggc cTa c gccacc, acgaatacgg g.ccctcggt ctcggccagc 14460 
HSSSt ccg 9 gcg.g= 9 S'cgctgcac cgcg.gctg, .gcgcgccat cgccg.gg c 0 

==: r.= s= :::: u ir 

~~ :;i is rs 5352 ;ss 
r.: fg : 4,^4 c^gc «». 

gagggcg y aataatq gat gaagegcact ccttcggcgt gatgggcccg 14880 

cgccgccacc gcgcgttcct gatggtgga*- y y y „ a 4- r f„ na ta 14 940 

cgcggcttcg gcatccgcga ccacttcggc atggccggcg aegaggcega catctggatg 
ggcacgctct ccaagacgct cgcctcgtgc ggcggcttca tcgccggcga gaaggcgctg 15000 
gtcgagcacc tgaagttcgc cgcccccggc ttcctctaca gcgtcggcat ggccccgccc 15060 
gtggcggccg cggcgctcgc cgcgctgaac cgcatgcgcg aggagccggc gcgcgtggcc 15120 
gccctgcagg cgcgcggacg cttcttcctc gaggccgcgc gcgccgcggg catcgacgtg 15180 
ggcctctcgc agggcatggc cgtggtgccc gccatcaccg gcagctcgat ccgcgccggg 15240 
cgcctggccg aggcgatgtt ccagcgcggc atcaacgtcc agccgatcgt ctacccggcc 15300 
gtgcaggaga actcggcgcg cctgcgcttc ttcgtgagcg ccacgcacac cgaggagcag 15360 
ctgcgcttca cggtccgcga gctcgccgac gcctggcgca agctctgagt ggcgggcccc 15420 
cggctgcgcg tcgtcgtctg cacccacggc ggcctgaacg gcgcgctggt gctctcgcgg 15480 
ctgctggccg cccccacgct cgaagtctcc gcgctggtga tctccagccg cgcgcgcggt 15540 
gcccatgagt cgtatggccg tgccgcgctc ggatacgtgc gcgcgagcgg cgtcgcctat 15600 
gcgctgtacc tctggtgcgc cacggcgctc gccgacctgc tgctgcgcgg cacgtccgag 15660 
ggcccggtcg cgcgcatcgc cctcgcgcgc ggcatcccgc tcctcgccac gccgcgcgtg 15720 
aacgatgcca ccgcgcgcgc cttcatcgcg ggggcggcgc cggacctgat cgtctccgcg 15780 
ttcttcaacc agcacatcga cgccgacgtg gctgcgctcg cgcgcgtggc cgcggtcaac 15840 
atccacccgt cgccgctgcc gcacttccgc ggcgtggatc cggtgtcctt cgcgcgcctg 15900 
cgcggtgccg agcgccacgg cgtgagcgtg catcgcatcg aacccggctt cgataccggc 15960 
gcgctgctcg cccaggaaac cgacgtcgag gcccccggca gcgtctttgc cgccaccgcg 16020 
gcgctctatg accgcggggc ggcgctcctc gccggccgtg ccgcggcgct ggccgccgac 16080 
ccgcgcggaa ccccccagcc ggcggggggc tcctacgact cctggcccac ccgcgcccag 16140 
gtcgccgcct tccggcgcgc cggcgggcgc ctgctccgcg cccgcgacct gtggcgcctc 16200 
gcgcgccggg ggccggccgc tttcgtaata gaatcagcgc ggtagcggcg actcacgccg 16260 
cgcttttccc cgaccccccg aggcccatga aacactggct gaagcaacac gcaatcttcg 16320 
cgctcaccgt cctgctgccg accgtggccg cgatcctcta cttcggcctg atcgcctccg 16380 
acgtctacat ctccgaatca cgcttcgtgg tgaggagccc ccagcgccag gtgcagaccg 16440 
gcctggtggg cgccctgctc tcgggcaccg gcttctcgcg ctcccaggac gacacctact 16500 
cggtgcacga cttcatcacc tcgcgcgacg cgctgggcga gctggacaag aagctcgccg 16560 
tccgcaagct ctacacggcc gccaacatcg acttcatcaa ccgcttcccg gggctcgact 16620 
gggacgacag cttcgaggcg ttccaccgct actaccagaa gaaggtcacg atcgacttcg 16680 
acaccgcgtc ctcgatcacg gtgctgcgcg tgcgcgcctt cgagaaggcc gactcgcggc 16740 
gcatcaacga cctgctgctg cagatgggcg agcgcctggt gaacgagctg aacgagcgca 16800 
gccgccagga cctgatccgc ttcgcgcagg ccgaggtctc gctcgccgag gacaaggtga 16860 
aggatgccgc gctggcgctc tccgccttcc gcagcaacca gtcggtgttc gagcccgacc 16920 
gccaggcctc gatccagctg cagggcgtgg ccaagctcca ggaggagctc atcgccaccg 16980 
aggggcagct ggcacagctg cgcaagctct cgcccgacaa cccgcagatc ggcgcgctcg 17040 
agaacaagtc ggcggcgctg cgcgtggcga tggcgcgcga gtccgccaag gtgaccggcg 17100 
gcagcggctc gttcagcgcg cgcgccccgg cgttcgagcg cctcaccctc gagaagggct 17160 
tcgccgaccg ccagctgggc gttgccctca ccgcgctgga gaccgcgcgc agcgaggcgc 17220 
agaggaagca gctctacctc gagcgcatcg tgcagcccaa cctgcccgac gaatcggtcg 17280 
agccgcgccg catccgctcg atcttcaccg tgttcgtgct gggcctggtg gcctggggcg 17340 
tggtgagcct gctggtggcg agcgtgcgcg agcacgtcga ctaggccgtg gagcagaacc 17400 
cttccctcct gcgcgccctc ggggtgcagc ggcgcgtgct ctacgcgctg ctgatgcgcg 17460 
aggtcatcac gcgcttcggg cgcgacgacc tcggcgtgct gtggctggtg gtcgagccga 17520 
tgatcttcac gctcggcgtg acggccctgt ggaccgcggc cggcatgaac cacggctcgt 17580 
cgctgccgat cgtggccttc gcggtcaccg gatactccgc ggtgctggta tggcgcaact 17640 
gcgcctcgcg cgcctcgatg gccatcgagg ccaacaaggg cctgctcttc caccgcaacg 17700 
tgcgcgtgat cgacgtgttc gtgacgcgca tcatcctcga gatcgcgggc gccacggcgt 17760 
cgttcaccgt gctgggcatc ttcttcatct ggatcggctg gatgccgatg ccgatcgaca 17820 
ccatcggcgc cggcaccgcg tactcgctgg ^ gtggctgccg gaga aattcc 18000 
acct g ct g tt ccc g ctctc g ggcgccgcct ^ ^ctact 18060 

ag cagttcgt gctcttgctg cogatggtgc ^ ggcaacggt c tc g ct ggg cc 18120 
tcggcaacgc ggtgcgcacg cactacgacg tgg***«t ggc gg g ^ 

tgtcg ct gCt cggcatgcac ct g= tggow-" *™ * 18240 
ccgggtcgag aacgtgacca gatgtactca cgcgcct gg gcgccggc aa 18300 

cagcttcgag ctcggccgcg gg cgcaacat cggcatcctc gg g tgcg 183 60 

gtccacgctg atccgcctga tcagcggcgc cgag ac ggg ^ 

cgagatgagc gtgtcctggc cgctcgcctt ^172 ^cgcg acaaggtcgc 18480 
gctggacaac ctgaagttcg tctgccgcat ^acggcgtc g egg tccactactc 18540 
ettegtcgag gacttcaccg ageteggegt ^ggtggagt tegactgett 18600 

geaeggcatg atcacgcgcc tcgccttcgc ^tcgatg g gg ^ ga 18660 

cctgatcgac gaggegatgg tggtgggcga lJj cq l cg ccaaggtgat 18720 

gctcttccac aagegcaagg accgcgcctt catc ^ cqqctgctg c ccttccccac 18780 
caagctctat tgcgagagcg 

cgtggacgcc gectacgagt tctacatgaa cgagg g ggC gttcgac 18900 

tgcctgaacg cccccgcgcg gccgctgcgc ^ctgcg catgatcgag 18960 

gg ccacgccg gcatcccgca ^ggcaege tgctcgccaa gggcctgccg 19020 

ggcatcgacg tggagggect gctgcagcac a ^ggccacg * J gcg cgtcgtg 19080 
ccgcgcggcg gcggcgacct cgcaccggac ^ ^ Jcattcgc 19140 

gt ctccacca agcaggagct oaccaacgcg cggcgcactt ccgcgacttc 19200 

cagctgetgg g c g c g c g c g a «aooctct.c W ^ acccgc 19260 

gtctggcagg cgctcttcgc gcgcacgctg cac 9ccgc gg * gctggtcacg 19320 

g gC c g gctt t c gcgtcgc g c g cgtgccgtgg ^ cgt'gctgatc 19380 

cgcaagctcg gctacacgct gcatccgcgc atcgacactt C J gcgc taccac 19440 

/.egagaege cctacccggc gcgggtcgcg ,.eggo£o « ^ ^ 1950Q 
gacgcgatcc cgctcctcat gccgcacacc atctccgaca cgtctccgaa 19560 

g cactacc ggg c g ct g c g cc g caacgtggcc togggogogo * ^ ^ 

gecacgcgca aggacctget c , ga gtc ^ ^ 

cacaacatgg tctcgcacca ctacttccgc g a 9 g y ccaacggagc cgccaccgcc 19740 
at cctgcgaa cgcgcgcgag cgagcgcatc cggctacctg 19800 

gt cgcacgat cgccctcctt cctcgcgcgc gaetgatege cgcgtgggag 19860 

ctcgccgtgg cgacgatcga gccgcgcaag aac cacgcgg Lgtcggcat gcccggctgg 19920 
ca cctgcgca cctcgcgctt cccgggcctg ogoctggtW ^ gctcttcgtg 19 980 
gg ctac g a gg cgctggtagc caagttcaag ^J q tgccacggtg 20040 

ctcgaggacg tgcccgcccc cgagc . ££££ ctgcggC agc 20100 

tgcccgagct teggegaggg cttcgatttc tec accgcgacgc cgccgagtac 20160 

ccggtgatcg ecteggagat cgccgcgcac C g cg aggt^ accgeg g^ ^ 20220 
tgcagcccct attccgtggc cgacctcgcc ^ c ^ cg aqq tctcgca gcggtacacg 20280 
g c g acgggcc tgcgccaggc gctggtcacg cgegg a^ g gg ^ ^ 

cccgaagcca tcctgccgca gtggcgcgag tacctgctcg g gg 9 ^ 20400 

ecgtgagcga tccgaaggcc accgacc g ^gggagga gtgggg^ ^ 
a cttcggcgt gatcaccaac ccgcgcttcc gecg ggg gtgcacgcgc 20520 

gegaattect cgcgtccggc cgcgtccatg ccgactacgt *J£ gtgggccgcc 20 580 
acatcgcgcc cgacttccgt ccgcgcacca t-tcgattt gg g tccccctcga 20640 
tcgtcatccc cttcgccgcc caggccgagc aggtgacegg eg g g 2Q700 
tgctggccga ggctgcgcgc aactgcgccg ■ ageagggegt ggecaatgcg eg 
tctcggacga ctc.ctcacc gggCtg cc gg = = cac 2O7J0 
tgttccagca catcgatccc, gcgcgcgggc W«tctt c g g tacgcggcga 20 880 
t g ggcg a g c gg c ggCgtggg c gcjttg-ct togt£*» cagcctccac 
cgtacggcgt ggcgccaccg ^cgagccgc cgcgc t g gcc acgcaggcac 21000 

ccagccgggc gg a g atcaa g gccgcgagca * * tacccggcgg 21060 

gggcg cc gg a acc gg caccc aatcccgacc aC at g ca g^ ^ m20 

cgg a g at g ct gttcctcgtg gcgtc g g tccttcgagg 21180 

accacggcgg c g a g ct ggg t ctcttcctct tcttocg g gccgccgcgg 21240 

cgaaggagag gcccagcgcg ^atcgtcg c f c f tacagg cgcacgcgcc 21300 

ccgcgacttc ggcagccacg eg<*eg«ggt * tcccagcg c g cgctcgacct 21360 

agtagtgcac ccagtaggag tccggacagg ttgcagg ttc acggagggct 21420 

cgggcccgac g c g cccct gg tgcettgjt «oW^ cg ; c cgcc tcatc g cc g t 21480 
cg a gggg ca g g a Cgg c g a g c «^cctcgc « aagcgcgtg c gcggccgcga 21540 
cgtcggcgag gcgcatcagc gtgggcaggg caggcgggcg cggtggtagt 21600 

gcgaatcgcg cagcatcgca g c g tc g cc g c ^gcgagcgc c ^ cttgcgtcgc 21660 
ggg tcca g aa c g aa gg c g cc g c gg catc g a OCflogoW g ^ acgc 21720 

ccccggagag ccgggacgcg .cccacaac Wttcgeotc cgj^g g ^ 21780 

tccgcaacgg cgccagcacc gcggtggega atgttcggcg cagagccgcg 21840 

ccgagcgcgc g aact g catc agcgattcgc cgagcga g tcaccgcgaa 21900 

cgggatccac cgcgccgagc gcgcgggcgg ^^7ac cgacatcgac gccgccccgc 21960 
gcg c g t g ca g ccgcaggcga t gg aac g c g g cgcgatg t g c a gg at gg cec 22020 

g ctcca g c g c accctc g ac g Scgccatcg, g^cgggtog aggaccgcga 22080 

agg tc g cct g cgcgccgggg aggttcgtgc Lggccgatc agcgccccgg 22140 

gcgccgcggc gg c g tc gcg c gcgccactga agtcg gg ccccaccgcc gcgaggtcg c 22200 
cgagctcggg ggccgcgcgg aaggcggccg ^ gaacgggtcg gggctcgcga 22260 
cgcgcagocg gtagaggcgc gcgcggtggt «*«gcooca g W acctggt gc g 22320 
tLgtgaatc gaacgagcgc tcgacctc.c g-goggtgg. 22380 

cgccccacag ggtcgccgca tgcgcggcc ££££ SLgW* ttca.ccgct 22440 
tcagctcgac gtcgccgaag cgccgctcgc gcaacgcg g tgC g C gg C ac 22500 

cgaggctcgc cttgtactc. c g ata gg cat tatgcg ta gg 22560 

tgtcg tccat ca g c gagg cc W^tg ggj gcgcggtcgc 22620 

ccgcgcgatc gtcgtagggc agcgtggccg cgcgcag g tcg ccc ggg c 22680 

ggaatcccag cttgcccagc acgcggggtg * ^ ta g agctg c g 22740 

tccagtc.c. ca gCg cc gg c ac.ttc.agt agagccac ^ ^ ^ 

ccaggtccgc g tcc gggCgg occtgcaoct cgccagca cg tagggccgcc 22860 

cg t gg cc gg c ggagaagatc ggcacgtggt. g L gg tttc g gccatcaccg 22920 

cgcgcacctc gagacgctcg accagcgagc gcagcgcg g g * g g C g C cac g t 22980 

cg ca g tactt g tc g atc g c g ccct gg c g ca ^gccoggc * gagac g C c g a 23040 

agcgcggctc ggcgcccttg tcgcccaggt ^f C JJ * atc acggc c g aat 23100 

acacgaagcg cggccgcgaa tcgtcgagct ^ogtcggog t gg aac g cc g 23160 

ccgcgatgcg ctcgaactgg cgcggcgcat ccagcggtgg aac gg at gg a 23220 

cgaagccgag cttcttgtac ac gCgg tc g c ^tgtagaa acccggtcgc 232 80 

tggccacgct gcggtagccg tggcggcgca gcatgcg gg ^ atgccggtga 23340 

gcaactgccc catgtagggc agcgcgcggc * aggcgg ccg g 23400 

gg atctc g aa ct cgg t g tc g caggtgttgc cgccgatc c g >igaflgoW 23460 

tggtgoggcc cggccgcgtc gcgccg ^ ££^y£ gatgatg tc g ggg c g ctcc g 23520 

-~ ;r g r g r ct a*-**- 2 35 80 

cggtggccgg gccgctcgcc gciy^y 
gcagccgccg cggcgcgacc gggcgggcga tgctcccggc gaccagcacg gcgtgcagga 23640 
cgaagcccag gcgccgcacg ttcacgttgt ggtcggtggg cgaatacgtc atcgcgtcga 23700 
actccaccgg ccagcggcgc aagcgtccgg cgacgaaggc gacgccgagc gcggcgagca 23760 
cgagcagcca ctgccggcga tcgtcgtacg cgtcgtgcag gtcgatggcc tcggtgatct 23820 
cgaacgcggc ccacgccagc gccgcgagca ctgcggcagc ggcaagcgcg cgcaccgcca 23880 
cgcgcgggcc atgcaggtac gccggcgcga acaccagcgc ctcgcgagcg ctcaggaaat 23940 
cgcccgggcc gaccacgcgg ccgaagaacg cgagcaggcc tgcgttcacc gcgtagagcg 24000 
cgcccgccag cgccacggtc cccaccgtcg cgagcgcccc gaccgcgacg aagagcagcg 24060 
catggaagat ggcgagcagg ccggcgttcc aggccaggcg ccgcgcgcgc gccacgggtt 24120 
ccacgtcata gcccaccacg aggcgccgct cgagggcggc ctgcagcacc gcgtaggcga 24180 
gcacgcagac cagcgtgccg cccagcagca cgccgaggaa acccgccaaa ccagccgtgt 24240 
cgtcgaggat catgggacca ccagggacgt cttccggcca gtctataacg ccccgcccca 24300 
tgaaaaacgg cgaggcccgg gggcctcgcc gtggagggaa ccgcgcaacc tcagggttgc 24360 
gggggtgcga ccgaccagcc gccggggccg atcaggtttg cgctgccgat ggtggcgagc 24420 
ccgttcatca ggcgcaccgt gacgctgccg tcggtgtggc ggaagaccag gtccttgcgg 24480 
ctgtcgccgt tcaggtccag cagctgcgtc acgttccaac ccgtgcccgc cggcagcgca 24540 
tcgcccgcgg tgatgatgga aacaccatcc atcaggcgga tgtgcacgcg gccgtcggtg 24600 
tggcggaaga ccaggtcgtc cttcgcgtcg ccgttcaggt cgcccaccag gttcaccgac 24660 
cacccgctgc cggccggcag gagctgggcg ctggcgccga aggtggtgcc gttcatcagg 24720 
aacaggtgcg cgcggccgtc ggtgtggcgg aagatcatgt ccgcgcgtcc gtcgccgttg 24780 
aggtcgccca ggtggcttac cgtccagccg ctggcggccg agaggaagcc cgctccggcc 24840 
gtcaccgtgg tgccgtccat gatgtagatg tagccgcggc cgtcgctgtg catgaagacc 24900 
aggtccgcct tgccgtcgcc gttcatgtcg cccgcctggg tgagcgtcca gccggtcccc 24960 
gccccgaaga gctgggcgct gccgatgatc gtcgtgccgt ccatcaacca gatgtgcgcg 25020 
cgtccgtcgg tgtggcgcag gatgatgtcc gccttgccgt cgccgttcag gtccgccgtg 25080 
tggctcacga tccagccgag cccggcgggc agcagctcct tgccggccgt caccgtagtc 25140 
ccgttcatga tgtacgcgta ggcgcggcca tcggtgtggc ggaacaggat gtcggcgcgc 25200 
aggtcgccat tgaggtcggc cgtctggctc accgtccagc ccgcgccggc gccgatcagg 25260 
ttggccgagc cggtgatggc ggtgccgttc atcgtccaca ccgcgatgcg cccgtcgctg 25320 
ttctgcagca cgaggtcgct cctgccgtcg ccgctcaggt ccgagacgac cggcttggtg 25380 
ggcggcgtga cgccaccgga aaccggcagt gccagcccgt cgatccaggc gcggtcggcg 25440 
cagttggcgc catccggggc cggcgggttc caggtcggcg agttgcaagg cgtggagagg 25500 
aagttcgaga aacgccacac cagcgtgtgc gcgcccgcgg tcaccgggaa cgacaccagc 25560 
ttccagcccg aggccgtggt gcccgcgtcg ctgaacacca ccgtgccgtc gatcaggaac 25620 
tcgaacttgc cggcgttggg gaagctcgac acgcgatagg cgaacgcgac gttgcccgcg 25680 
agcagcgtgc ccgcgaacga gaggtcggag ttcaccgtcg tcgagttggt cgggtcgctc 25740 
gtcaccacct gcgccgaacg aaggctggtc gcgccctcga aggcctggtc ggagccgacc 25800 
gtccatgcgt tggcgccgcc cgaggtcgtg aagccggcag gcagcgtgbc acccgtgggc 25860 
cacggcgaga ggatgatgcc ggtcgcggtc gccgggccgc cgatgaagac acccgtggaa 25920 
cccgtcgcat tggagagcgc gacgttgaac gtctcgttgc cttcaccgat accgtcgctc 25980 
gccaccggga ccgtgatggt cttgggcgtg gtctcgccgt tggcccagct gagcgagccc 26040 
gaggtcgcgg tgtagtcggc gcccgaagtg gcggtgccgt tggaggtggc gtagctgacc 26100 
gagatggcac ccgccgagcc gccgatgcgg ctcacggtga gcgtcacgtt gccggcggtt 26160 
tccgcggccg cgaaggtggt gccggtgaac tgcaccgaac cgggaacggc gggcggggtg 26220 
accgtggact tgagcgccga gagtgccgcg cggttgttgt tcagcgagag cgcgtcgttg 26280 
gcctgcgcgg tgccgcacgg ttgcggcgca ccgacacccg tcgtgcacga gaggcccggg 26340 
ttggagaagc ggtagacctt ggtcgtcgag tcatgcagca gcgacatgat ggtggtgaag 26400 
ttcgccgcgt cgttcgtcgt gcactcgggg cgctggctca cgccgcaggc gccgccgccg 26460 
gagccgttgg gcagggccga gttgcaggtg aggttgccgc tcgcgcaata gaagtggccg 26520 
tacgagtacg cgaacgcacc ttgcggcggg ctcgcggtgc cgcccgcttc ccaggcctgc 26580 
gaggggcggt cgtggcggtt gcccatggca tggccgagct cgtggatgaa gacccactcg 26640 
caaccgaaca ggcagccgga agcaacggcg taggcaaagt tggggtccgg cgtgctggtg 26700 
ccgatccagg ctgcgccgtt gccgccgaag tcggcgccgt tgcgcatgaa ggccaccatg 26760 
tccgcgccgt attgcgtgcg gatggcctcg atgttgccga acgtcgcggc gtcgaagctc 26820 
gcgtcgcccg gggtgatggc atgcatcgcc gtggtgtcgt cgatggcgtc gtcgtagttc 26880 
acctgcgtcg cattcaccag gcgcagcgtg atggcgacct cgctgtcggc atacgcagtg 26940 
ttggcgcgcg tgacgaggaa gttcaggcgc gtcatcaggc tgccgccgag gcgctggccg 27000 
aagccgcgcg agtacacgat catcaggtcg atgacgttct gcggcgtggg cgtcacctgg 27060 
gcgagccgac ggtattcacg ccggggatgg cgatcagggt ctcggggctg ccggtcgcga 27120 
cgacgcgcga cttcgagccg gtcgtggcca cggcgtggtc ggcgccgagg ctgatgatcg 27180 
gcatgctggc ctgctcggcc gtcatgtcga ccaggtagtc gacggtgccg cccggaatca 27240 
ggcggtactc accctgcggg gtcgagaaca cgccgaagga gccttcaggc ccggtcgtga 27300 
cgatcgcgcg gtggttgatg ccgtcggtct tgctcttcgc gatccacgac gtgatgccgt 27360 
cgccgtgggc ctggagcaac tcgaagacat acgagtggcg caccccgttg gggagcgaca 27420 
gctcgacctc ggagcgcgga ctcagggcat ggagggccgc ggcgttgaag cgcaccgctt 27480 
gccgggcaac ggcgctgccc ggcgccggca tcgcggcggc gggtgcggag accaggatct 27540 
ggggaaccgc ggcgaaggcc acggaagaga agacgatcgc cagcgcgccg aggaaggcgg 27600 
ctgtgatccg ttgcgtgaat gcgttcatgt ggagctccgg aagttgaccc atgcccaatc 27660 
cgctatgtcg cggagatgtg gacaaaaggt atcgagcggg cgtgacgacc cgcccccgga 27720 
gggatgctcc aaaaggacta cgagggtgct acggctgggt tggtggcgaa ggccgaagga 27780 
gaactccttg tggtctccgc gcgacttaag gttgcgaggg aaccaccgac cacccaccgg 27840 
ggccgatcaa gtccgcgctg ccgatcgtgg tgacgccgtt catcaggcga atcgtcacgc 27900 
gcccatcgtt gtggtggaag accacgtcca tgcgcccgtc gccgttgaag tcgaggagct 27960 
gcgtcaccgt ccagcccgcg aacggcggca ggatgtcggc ggcgccgagg atcgcggtgc 28020 
cgttcatcag gcgcacgtgc gcgcggccgt cggtgtgcgc gaacacgagg tccgcccggc 28080 
cgtcgccgtt gagatctccc accaggttca ccgaccatcc ggtgcccgcg ggcaacagct 28140 
cggcgctggc gccgaacgtc gtgccatcca tcaggaacag gtgtgcgcgg ccgtcgttgt 28200 
ggcggaagac gaggtcggcc ttctggtcgc cgctgaagtc cgcgacgtgg ctcaccgccc 28260 
agccgctgcc gggcgcgagg aagcccacgc cggcggtgat cgccgtgccg ttcatgatgt 28320 
atgcgtagcc gcgcccatcg acgttggaga agacgatgtc ggccttgccg tcgccgttca 28380 
tgtcgccggt gccgaccacg ttccatcccg agcccgcggg cagcaggctc gcactgccgc 28440 
tgatcgcggt gccattcatc agccagatgt gggcgcgtcc gtccgtgtgc tgagcagcag 28500 
gtcggccttg ccgtcgccgt tcaggtccgc cgtgtggctg atggtccagc ccgtgccggc 28560 
ggggaacagc tccttgccgc ccacgaccgt gagcccgttc atctggtacg tgtagatgcg 28620 
cccgtcggtg tgctggaagt agatgtcggc ctggctgtcg ccgttggaat ccgcggccat 28680 
tgatcacgct ccagccggcg ccggcgggaa tcaggttcgc cgaaccagtg atcgtcgttc 28740 
cgttcatggt ccacgcggcg atgcgcccat cggtgtgctg gaacaggaag tcgctctgcc 28800 
cgtcgccgtt caggtcggtc gccgcatggg tcgggaacgc caccttcgcg aggaaggcat 28860 
cggagaaacc cgtgagcaac gtcttgtacg cgccgggcgt ggtcggatag ttcgccgacg 28920 
acgtgcggcc ggccacatag gccgcgcccc gcgcatcgac gccgacgccg taccccacct 28980 
cgacgttgtt gccgccgatc agcgtcgaga acacgatgtc gccgttgcgc acgcgagtga 29040 
ggaacgcgtc ggcatcgccg ccggggccct gcaccgggtt cagggtcggg aagccggcga 29100 
cctgcgtgta gcccacgacg agcgcatcgc cattcggcgc cagggcgacg tccaggggat 29160 
tgtcgtaggc cgagccgccg aagatgcccg acgattcgag cgcgccggtg gccgagaagc 29220 
gcgtcacgaa cgcatcgacg gtgccggcga aattgcgcac cgttccggtc tgcgggaagc 29280 
tgggcaggcc ggtatcgccg acgacggtcg cctggcccgt gaccgggttc accgcgatcg 29340 
cggtcacgtt gtcgccgctg ccgccgagga acgtggaata cgcgaccggg ccgccgctcg 29400 
caccgatctt ggtgacgaag ccgccaccgg gttgggtctg gtacgcattc acgttgcgga 29460 
acgtcgagga gcgcgtggag cccgcgacgt acgtgttgcc cgcattgtcc gttgcgatac 29520 
cgcggccgat gtcatcaccg ctgccgccga gatacgtgga atagacgaag ccgccgttga 29580 
acgagagctg gtgacgaacg cgtcgcgatt gaggccggtg ccgacgaaca cggctgcagc 29640 
gcgctgccgc cggtcgggaa gccgcgcgcg gccgaacccg tgacatgcat gaagccgttg 29700 
ttgtccacgg cgatcgccat gccctcgtcg tcgccggagc cgcccagcag ccgcgagaag 29760 
atgacgttgc cagtggtatt cagcgccgtg acgaatgcat cgcgcccgcc ggcatggttg 29820 
gtgcaatcgg tgacggcgaa gcaatgcggg aacgcgttgg tgccggtgga accggtgatc 29880 
gcgacgaact cgttgccgcc ggaggcgccg agcgccagcc cgcggagata cgcttcggcc 29940 
ccgagcgtgg tcgagtacac cacctgctgc gtggtggggt tgaacttcac caccgtggct 30000 
tcgtacgtgc ctgcggtata ggtcgcgagc ccgacgtaca cgttgccctg cgcgtccacc 30060 
ttgaccggcg tggtccactg ctcgtcgccg gcgccgccca ggtacgtgga gaacgccatc 30120 
accgggtcga tcaccagcgt gcgggtgctg tcgtagtcgg cgatcacgaa gccggcctcg 30180 
gcgccctgct cgcccacgaa gagctcgaag cgcgcggcca ccggcacgcg cgaatcgccg 30240 
acctgctgga aggctaccgg cgcgtgctgc gtaaattcct cctggcccac gcgcatgcgc 30300 
aggttgccct cgccatcgat ccacgcgtgg tcggcggacg agaggtcgag gcggatctgg 30360 
cgcggatcgg cgcgcggggc gaccacgaag tcgtattcga gcgtgccctc cttgccgtac 30420 
acggtgaggt ccacgccgcg atacaggtcc ttcagcgaca cgcggccgaa gtgcggcacg 30480 
ttttcgcgcc ggccggcggc cgtggtgccg ctgtaatagt ggctgagggt ggcccggggc 30540 
tcctcggcct cgatgaccgg atcggaagcc cccgcgaagc gcacgcgcaa cagcgaggcg 30600 
ctggcagctt tcctggagcg tggctcgaag gcaatgccat cgcggctcac ggagacgcgt 30660 
ccggcctgcc cgcgcgagac gtagagcgca tccgggccga actggccctc gttgcgctcg 30720 
aacgtgatgg gcgcgttcct gagggcatca gggacgacgg catggacggc ggcgggagtg 30780 
gcgagggcgg ccgcgacgag cacggccagg gggcggagcg ccgggcgggc gccgctggag 30840 
ctgcgggtca tggggcagtg acctttctgt tgttatggct gcttgttgtc tttgcgggat 30900 
ccacagctcg gatccagctg gcgatactac cagcgagcga gtggtttttc gtgacccagt 30960 
cccgcagtcg ggccccggcg acaacggcgg catcgcggtc ctttgccagg gcacgcacgg 31020 
cctcgatcca gtcgccgtcg cgggccagca cggcgggggc atcgcggtag ggctcgagat 31080 
ccgaggccac caccgggatg ccgaggcagc cgtactcgag cagcttcagg ttgctcttcg 31140 
cgcgattgaa cgggttgtcg cgcagcggtg ccaccgcgac gtcgagtgcc agcgacgcga 31200 
gcttctccgg gtactgcgcg atgggcacca tgtcgtgcac ttccgcggcg aacggcgcca 31260 
gctcgggcgt gcacaggccg aggaacaccc agtcgatctc gcgatgcgtg gcgcgcacca 31320 
cgggctcgag caacttcagg tcttccccat gctgtttcgc ccccgcccag cccacgcgcg 31380 
gacgcgcgcc acccgcggga cggttcgcga ggccggccca acgctccgca tcgatcgcgt 31440 
tcgggatcac gcgcacgtcc ttcgcgccgc ggccgaaggc ctccgcgagc ggtgcggtgg 31500 
acaccacgag gcggtcgcac aacgccaccg cgcgcgcgat gcgctgggcg atgtccgggt 31560 
agatcgtcgc cgcatacgga ttgccgggcg gcagttgcgt gagcaggtca tccaggccca 31620 
ggaccttcag cgcgcgtccg tggcgcgcga gcacctcgag cgaggtgagc tggtagttgt 31680 
ggaagaagtt gtgcgcgagc acggcgtcgg cgtcgaggcg ctgccactcg acgcggttgg 31740 
gcgcgcagcc cgtggcgtgc tcgcccatca tcacgacctc gacgcggccc gcgcgctcca 31800 
gcgccgcgca cggctggcgc acgcgcacct cgccggagcc ccagcggtcg aacggaaacg 31860 
cgcagagccg aaccgcgtgc gtgtcgttgc cgcgcggcgc gaagcgttcc acgggcccgt 31920 
agccctcgcc gcccatgcgc aacgccgggt ggtagtgcgg atcgtcgtcg agcaccggct 31980 
gccagcgctc gcgcatccac gcctcttcgg aaggaatgac gagcttcgcc accggaacga 32040 
cggcatcggc gcgcgcatcg gcgagcaacg ggaagtccgc cgcgatgccg gggcgcgcaa 32100 
ggatgtcgaa gcccgctgcg cgcagcccga ggcacaggtg cgccatcgcg aacggccccg 32160 
cgcgctcgat ctcgtgcagc gcctgcgcgg agagcgccgc gtcgcggttg accagcgcga 32220 
cg ccct g c g c gcgcggcgga gcgcccgaga cgcgagggc a cccgtggagg 32400 

agccgggaac gcgcacaccg ccggggccca ^aattcg ctgcaacgcg 32460 

gg cct g cgat gccctgctcg aggcgctcga ^cgg ^cgcgcg aacgcggagc 32520 
catcgacgat cgcgatccac agcgtcgccg ^gcggat <* *J agctcaccgg 3 2580 

cggtttcgcc gcccaccgga accaggtgcg tgcgcacgcc «« acgttcgcg g 32640 

gccagcgttt cggcgccctt cgcgtcgaga tcgacgtaca ^ cgtcgcgtca 32700 

Lgcccttca cgagcgcctc gatgcacgcc ^ot ccg.cg ^ ^ 
atgaaggcgc aaatcgtgcc ggccccgatc ^cg gacca c g cgo 32820 

g cagctggca ggaacgcggc ctgcgtgcgc £££££ tgt g tcaggtc ga ggcgatgc 32880 
tggtggcgct cgcgcgcggc ^cgctttcc W« * c gaggcgcagc 32940 

acgagcagct cgcgcagtgc caccacggcg ^cgcacgg cag caacgca 33000 

gc gagatcgg tgacgcccgc ccagccgccg g g cgcgaa ctc cgcgccgaac 33060 

gcgcggcgca gcaccgcgag cccggcgagc ^^ggtgtc 5 5 5 actgcagccg 33120 

^ggacga acgccggatc gtagcgttcg ^ cca 33 180 

tagagaagct cggcgttcgg gttgcgtgcg -cgcacccg J qatc 3 3240 

tcggccagcc ggtcgcccto ctccaccagc gctcgaggac cacggtctcc 33300 

gcgcggttga cgcgcccgag egcogegtog ctcgcg g * ^ gtggtcgtag 33360 

agggcgatcg cggcagtgct cctggccggc **^ cg « « cgccgcggcg 33420 

//atcgccgc ccagcggcgt gcgctcgcgg c^atg^ ^ cgccgccagc 33 480 
aacaacggat cgatcgttgc gggcac.gca tccaccg g gcggtggtac 33540 

tcgcgataga gcgccaggta ctcggccgca "cgcctcga -J cagcaggagg 33600 

ttcaggttct ccgcgacgcg atcgagcgcc jtgcggtcgt atcgcgcaC g 33660 

gC cacgatcg cgccgggatc «ttgtjowo ^ ctgtgcctcg 33720 

cgctcgggaa tggcgccatc gcgcgtgccc 5ccaccggg g 9 gtcgatgccg 33780 

ctcaacacca ggcagaaggt ctcctccatc ^cgagggca *^ cgca 3 3840 

cgaaggatgc gcggcaggtc cgtcgcggca t«g**oe»o * gatgcgcfcgc 33900 

ttcgccgccg cttcgagcgc gagatcgcgc ^accccagg * gccgaagcgg 33 960 

cccgccagcc ggcgcgcggc ctcgatgatc ^tgcgcgc cggcaggtcg 34020 

ccg a g gaagg ccacgcgcag caccggcogc ^tteg^o gjg ^ ^ 
gcaatgccgt gcccgatgac gcgcagcttc cggC cgcgtc cacggcatcg 34140 

atgcgcatcg ccgcgtactc ggacgggcag ggcgc cgtcggcccc 34200 

gacgccgcgg cgaagcgcga ttcgaggtag ccggtca gg gccgcgcgc g 34260 

acgatcgcct cgctgcggcc gcgcaggcaa tccacgcatc 5 ggcgcaC agg 34320 

g c g gcgcggc cgcacggcag Otoctgcggc ^« .^Lgcgo gatgcgcggc 34380 
tacgagaagt cgtgcgccgt ******* cgacgtcgta gcccccggac 34440 

aggcgcaggc tgttccagcc caccaacgac c g cgccgcctc gtg cccgato 34500 

agcagttcgc ggaaggccgc ctccacgccc gtg c g tcgcc 34560 

acgcggatgc cgggcgtcat gcgctcgggg cgggcagcac cagcgtcgag 34620 

aggcgctcgg tggtgaagtc ggtccacgcg ^cgggccct * qcqc 34680 

gC gacgtcct tgcgcagcag gttcaccagc Lggctccg ctcgcgctcg 34740 

g.gtcgaggo gatgcagcac ctgcagcacg ^cgggcggc ^ ccacgcg cgc 34800 

gccgcggcgt tgacgcgctc ggcggcgcgc ^ttggc ctcgcgcgcg 34860 

acgcccgcga cgtaggtggg ccagcggcgc tcgaggcgca gtcg 34920 

gcgt cgatgc cggccacccc , = 34g80 
tcgcagcacg cgatctgcca gtcccggtcc agcgcg g c ctcgcggcgc 35040 

= 9 .r.= zssz — « — 35100 
gcgacgcagg cggagatggc atcgacgtcc ggctcgccgt cgggccgctc gaagatatcg 35160 
gcgacgccgg gcaggctgag gatcgtcgcg cggttggaga gcgggcacac gatgccgacc 35220 
cgcgcatcgc tgtcgcggca acgcagcagg ccctcgagcc aaccggggcc gacctgcgta 35280 
tccgaattga gcaccacgaa gtcgccgccg ctgccggcca ggcccgcagc gaccgcgcct 35340 
gcgaagccga ggttcttcgc attggtgacg acacgagtgc cggcacgacg cgccgcgaag 35400 
ctgcccagca acccggcgag gcgcggatcg gtgctcgcgt catcgacgag gacgatgccg 35460 
tgcgcggcgc ccgtgtggcg ggccagcgac gcgaggcagc gttcggtgtc gtcgtggccg 35520 
ttgaagaccg ggacgacgac tagcgccggc gccatccgac cggcgcttcg agctgggcca 35580 
atcgcgccag caggcgctgg tgctcctgcg agagtgcctc gagctgcttc tgcgaggcga 35640 
cgtaggcgcc tccatgcgct cccaggtgcg tgcgcgcctc ctcgagctgc cgcagcgact 35700 
gctcgtggcg accggagacg tcgtggtcgc ggtcggagag cagcgagaag cgcccggccc 35760 
ccgccgtcac cgcctcgcgc tcgccgcaca gggcgaggaa gtacatcgca tcggcgacgc 35820 
ccgccgcggg tttcgcagcc tccggcgccg aagcttgcag caactgcacg gcgtcgggca 35880 
ccgcgtgcag cggccagatc gccgaatacg catcgacgcg ttgcccgaac agggccacct 35940 
ccgggaaccg cgccttcagc acggcgagga attcgtgctc gtagagctcg cgcacgtgat 36000 
gcgggttgcg gtagtcgcgc tggtccgagt acacctcgcg gttcggcgtc gagaccagca 36060 
gcaggccgcc gggcgccagc acgcgcttcg cttcgtcgag caggcgctcg ggatcgggaa 36120 
tgtgctcgag cgtctcgaag gagacgagga gatcgatgct tgcgtcatcg cacggcagcg 36180 
actcgcagcg gccctcgacg tactgcaggt tctgcgcggc accatagcgg cggcgcgcct 36240 
gcgcgatggt ctcggcggcg acatccgcgc ccaccacgga cttcgcgcgc gtcgccagca 36300 
gcgccgagcc gtagccctcc ccgcaggcga tgtcgaggac gcggcagcct ccggcgagcg 36360 
gcaacgcaaa gtggtagcgg tgccagtgct cgtaccagat ctcgccctcg aagccgggct 36420 
ggaaacgttc gtgttccatc gggagagtat aggtggatgt gcaaccgccc tccggagagg 36480 
gcggttgcga tgatcttatt cgagatcggc tgctacggaa gcaccggcaa cggctgcggc 36540 
ggcgccaccg accatccggt gcccgcgccc agcaggttcg cgctgccgag cgtggtgagg 36600 
ccgtccatca ggcgcaccgt gatgcggcca tcgacattct tgaacaccat gtcgagcttg 36660 
ccgtcgccat cgaagtcgag cagttgcgtg acggtccagc ccgcccccgc cggcagcacg 36720 
tccgccgcgc tgaggatggc ggtgccgttc atcaggcgca cgtgcgcacg gccgtcggtg 36780 
tggcggaaca cgatgtcacc gcgcccgtcg cggttcatgt cgcccacgtg gctcacgatc 36840 
cagccggtgc ccgcgccgag gagctcggaa ccggcgccga acgcggtgcc gttcatgatg 36900 
aagagatgcg cgcggccatc ggtgtggcgg aagaacatgt ccgccttgcc gtcgccgctc 36960 
acgtcgccca ggtgcgtcac cgtccagccg ctggcgggcg agaggaagcc cgcgccggcg 37020 
gtgatggtgg tgccgttcat caggtagatg tagccgcggc cgtcagcgtg gatgaagacg 37080 
atgtcgtccc tgccgtcgcc gttgaggtcc ccggtggcca cgactctcca gcccgtcgcc 37140 
ggccccagca gctgctggct gccgatgatg gccgtgccat ccatcagcca caggtgcgag 37200 
cggccgtcgg cgttgcgcag caggagatcc gccttgccgt cgccgttcat gtcggcggtg 37260 
cggtcgatgc tccagccgag gccggcgccc agcagctcct tgccgccggt caccgtgagg 37320 
ccgttcatcg tgtacacgta cacgcgcccg tcggtgtggt gggatcctct agagtcgacc 37380 
tgcaggcatg caagcttgag tattctatag tctcacctaa atagcttggc gtaatcatgg 37440 
tcatagctgt ttcctgtgtg aaattgttat ccgctcacaa ttcacacaac atacgagccg 37500 

gaagcat 37507 



<210> 3 
<211> 26 
<212> DNA 

<213> Artificial Sequence 




<220> 
<223> Description of Artificial Sequence: Sense primer. 

26 

<400> 3 

ggsccskcss tsdcsrtsga ya«9 c 



<210> 4 
<211> 23 
<212> DNA 

<213> Artificial Sequence 

<220> 
<223> Description of Artificial Sequence: Antisense primer. 

<400> 4 
gcbbssryyt cdatsggrtc scc 23 



<210> 5 
<211> 773 
<212> DNA 

<213> Artificial Sequence 



<220> 
<223> Description of Artificial Sequence: OriT derived from plasmid RP4. 

<400> 5 a ^ractctt cttgattgga gcgcatgggg SO 

gatctgtgat gtacttcacc agctccgcga -£cgotc£ ^ gtca tggctc 120 

fcgtgcttgg caatcacgcg cacccccc W ^ ^ 

tgccctcggg cggaccacgc accgaaccgc gccgtgcgcg ggtcgtcggt 240 

cttcgccagc agggegagga tcgtggcatc J CC attgatgc gggccagctc 300 

gagecagagt ttcagcaggc cgcccaggcg geccagg g cg g CC agcag 360 

gcggacgtgc tcatagtcca cgacgcccgt ££££ « tcaatcg ctc ttcgttcgtc 420 
gtaggecgae aggctcatgc cggccgccgc g ttg gcttgg tttcatcagc 480 

tggaaggcag tacaccttga taggtgggct cagcctcgca gagcaggatt 540 

dtccgcttg ccctcatctg ttacgcc.gc ggtagc gg ^ ^ 

cccgttgagc accgccaggt 9<^ at ^ Lgccgttgg atacaccaag gaaagtctac 660 
t gggcctact tcacctatcc ^aaaggat, ^ gatggatata ccgaaaa aat 720 
aC gaaccctt tggcaaaatc ageggaaaag ateegtegga tct 7™ 

cgctataatg accccgaagc agggttatgc agegg 



<210> 6 
<211> 2012 
<212> DNA 
<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Phi C31 integrase. 

<400> 6 

agatctcccg tactgacgga cacaccgaag ccccggcggc aaccctcagc ggatgccccg 60 
gggcttcacg ttttcccagg tcagaagcgg ttttcgggag tagtgcccca actggggtaa 120 
cctttgagtt ctctcagttg ggggcgtagg gtcgccgaca tgacacaagg ggttgtgacc 180 
ggggtggaca cgtacgcggg tgcttacgac cgtcagtcgc gcgagcgcga gaattcgagc 240 
gcagcaagcc cagcgacaca gcgtagcgcc aacgaagaca aggcggccga ccttcagcgc 300 
gaagtcgagc gcgacggggg ccggttcagg ttcgtcgggc atttcagcga agcgccgggc 360 
acgtcggcgt tcgggacggc ggagcgcccg gagttcgaac gcatcctgaa cgaatgccgc 420 
gccgggcggc tcaacatgat cattgtctat gacgtgtcgc gcttctcgcg cctgaaggtc 480 
atggacgcga ttccgattgt ctcggaattg ctcgccctgg gcgtgacgat tgtttccact 540
caggaaggcg tcttccggca gggaaacgtc atggacctga ttcacctgat tatgcggctc 600 
gacgcgtcgc acaaagaatc ttcgctgaag tcggcgaaga ttctcgacac gaagaacctt 660 
cagcgcgaat tgggcgggta cgtcggcggg aaggcgcctt acggcttcga gcttgtttcg 720 
gagacgaagg agatcacgcg caacggccga atggtcaatg tcgtcatcaa caagcttgcg 780 
cactcgacca ctccccttac cggacccttc gagttcgagc ccgacgtaat ccggtggtgg 840
tggcgtgaga tcaagacgca caaacacctt cccttcaagc cgggcagtca agccgccatt 900 
cacccgggca gcatcacggg gctttgtaag cgcatggacg ctgacgccgt gccgacccgg 960 
ggcgagacga ttgggaagaa gaccgcttca agcgcctggg acccggcaac cgttatgcga 1020 
atccttcggg acccgcgtat tgcgggcttc gccgctgagg tgatctacaa gaagaagccg 1080 
gacggcacgc cgaccacgaa gattgagggt taccgcattc agcgcgaccc gatcacgctc 1140 
cggccggtcg agcttgattg cggaccgatc atcgagcccg ctgagtggta tgagcttcag 1200 
gcgtggttgg acggcagggg gcgcggcaag gggctttccc gggggcaagc cattctgtcc 1260 
gccatggaca agctgtactg cgagtgtggc gccgtcatga cttcgaagcg cggggaagaa 1320 
tcgatcaagg actcttaccg ctgccgtcgc cggaaggtgg tcgacccgtc cgcacctggg 1380 
cagcacgaag gcacgtgcaa cgtcagcatg gcggcactcg acaagttcgt tgcggaacgc 1440
atcttcaaca agatcaggca cgccgaaggc gacgaagaga cgttggcgct tctgtgggaa 1500 
gccgcccgac gcttcggcaa gctcactgag gcgcctgaga agagcggcga acgggcgaac 1560 
cttgttgcgg agcgcgccga cgccctgaac gcccttgaag agctgtacga agaccgcgcg 1620 
gcaggcgcgt acgacggacc cgttggcagg aagcacttcc ggaagcaaca ggcagcgctg 1680 
acgctccggc agcaaggggc ggaagagcgg cttgccgaac ttgaagccgc cgaagccccg 1740
aagcttcccc ttgaccaatg gttccccgaa gacgccgacg ctgacccgac cggccctaag 1800
tcgtggtggg ggcgcgcgtc agtagacgac aagcgcgtgt tcgtcgggct cttcgtagac 1860
aagatcgttg tcacgaagtc gactacgggc agggggcagg gaacgcccat cgagaagcgc 1920 
gcttcgatca cgtgggcgaa gccgccgacc gacgacgacg aagacgacgc ccaggacggc 1980 
acggaagacg tagcggcgta gcgagacacc cg 2012



<210> 7 
<211> 84 
<212> DNA 

<213> Artificial Sequence 
<220>
<223> Description of Artificial Sequence: Left arm of
pPAOI6 transposon.

<400> 7
ctgtctctta tacacatctc aaccatcatc gatgaatttt ctcgggtgtt ctcgcatatt 60
ggctcgaatt cgagctcggt accc 84

<210> 8 
<211> 94 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> Description of Artificial Sequence: Right arm of
pPAOI6 transposon.

;r ct 8 c t a g a^acct, ca gg cat g ca a^c.a c.actacgca cta.ccaaca 60 
agagcttcag ggttgagatg tgtataagag acag 94



<210> 9 
<211> 31 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> Description of Artificial Sequence: Primer AmF.

<400> 9
ccctaagatc tggttcatgt gcagctccat c 31



<210> 10 
<211> 34 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> Description of Artificial Sequence: Primer AmR.

<400> 10
tagtacccgg ggatccaacg tcatctcgtt ctcc 34



<210> 11 






<211> 32 
<212> DNA 

<213> Artificial Sequence 



<220>
<223> Description of Artificial Sequence: Primer OriTF.

<400> 11
gcggtagatc tgtgatgtac ttcaccagct cc 32

<210> 12 
<211> 37 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> Description of Artificial Sequence: Primer OriTR.

<400> 12
tlgtacccgg ggatccgacg gatcttttcc gctgcat 37

<210> 13 
<211> 32 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> Description of Artificial Sequence: Primer Fint.

<400> 13
aacaaagatc tcccgtactg acggacacac cg 32



<210> 14 
<211> 21
<212> DNA 

<213> Artificial Sequence 

<220>
<223> Description of Artificial Sequence: Primer ».

<400> 14
cgggtgtctc gcatcgccgc t 21



<210> 15 






<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
amyE-BamHI 

<400> 15
atcgcaggat cctgaggact ctcgaacccg 30



<210> 16 
<211> 37 
<212> DNA 

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Primer
amyE-EcoRI

<400> 16
cgactgaatt cagatctagc gtgtaaattc cgtctgc 37